#4 Project progress: Successfully building a basic machine learning program

In this journal, we continue building our first basic machine learning program, now that the data has been imported and cleaned. Besides the pandas module, we will use two more modules, sklearn.feature_extraction.text and sklearn.metrics.pairwise, to import the TfidfVectorizer class and the linear_kernel function.

TfidfVectorizer comes from the sklearn package and converts strings (words) into a matrix in which each row is a TF-IDF vector. linear_kernel also comes from sklearn and computes the linear similarity (the dot product) between two vectors. The cosine_similarity function can do the same job, but because TfidfVectorizer normalizes each vector to unit length by default, the dot product already equals the cosine similarity, so we use linear_kernel, which the computer evaluates faster.
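To make this concrete, here is a minimal sketch of the two steps on a tiny made-up dataset. The three overview strings are hypothetical stand-ins for the cleaned movie data, and `stop_words="english"` is my own assumption rather than a setting confirmed by the project code.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity, linear_kernel

# Hypothetical toy overviews, standing in for the cleaned movie data.
overviews = [
    "a young wizard attends a school of magic",
    "a wizard and a hobbit travel to destroy a magic ring",
    "a detective hunts a serial killer in the city",
]

# Convert the strings to a TF-IDF matrix: one row per overview,
# one column per word in the vocabulary.
tfidf = TfidfVectorizer(stop_words="english")
tfidf_matrix = tfidf.fit_transform(overviews)

# Rows are L2-normalized by default (norm="l2"), so the plain dot
# product from linear_kernel already equals the cosine similarity.
sim_linear = linear_kernel(tfidf_matrix, tfidf_matrix)
sim_cosine = cosine_similarity(tfidf_matrix, tfidf_matrix)

print(tfidf_matrix.shape)
print(np.allclose(sim_linear, sim_cosine))
```

The two similarity matrices match, which is why swapping cosine_similarity for linear_kernel does not change the recommendations here.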

The third new tool we use in this project is the pandas Series. We use it to assign an index for the DataFrame; you can take a look at line 31. Line 36 shows the code that creates the recommendation system. We use enumerate to pair each similarity score with the position of its title, and we wrap it in list() because we want the result as a list. Then we use sorted with a lambda function to sort by score and keep the 10 most similar movies. We use iloc to retrieve the matching rows of the sheet. From here, I believe we can start to demo our machine!
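The steps described above can be sketched roughly as follows. This is not the exact code from the project: the toy `movies` DataFrame, the column names `title` and `overview`, and the `recommend` function name are all hypothetical and only illustrate how Series, enumerate, sorted, and iloc fit together.

```python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import linear_kernel

# Hypothetical toy data; the real project uses the cleaned movie dataset.
movies = pd.DataFrame({
    "title": ["Wizard School", "Ring Quest", "City Detective", "Magic Ring"],
    "overview": [
        "a young wizard attends a school of magic",
        "a wizard and a hobbit travel to destroy a magic ring",
        "a detective hunts a serial killer in the city",
        "a thief steals a magic ring from a wizard",
    ],
})

tfidf_matrix = TfidfVectorizer(stop_words="english").fit_transform(movies["overview"])
similarity = linear_kernel(tfidf_matrix, tfidf_matrix)

# Series that maps each title to its row position in the DataFrame.
indices = pd.Series(movies.index, index=movies["title"])


def recommend(title, top_n=10):
    """Return the titles of the top_n movies most similar to `title`."""
    idx = indices[title]
    # Pair every row position with its similarity score, as a list.
    scores = list(enumerate(similarity[idx]))
    # Sort the pairs by score, highest first.
    scores = sorted(scores, key=lambda pair: pair[1], reverse=True)
    # Drop the movie itself, then keep the top_n best matches.
    scores = [pair for pair in scores if pair[0] != idx][:top_n]
    movie_rows = [pair[0] for pair in scores]
    # iloc retrieves rows by position.
    return movies["title"].iloc[movie_rows]


print(recommend("Wizard School", top_n=2))
```

Filtering out the query movie by its position, instead of simply skipping the first sorted entry, is a small design choice that stays correct even if another movie ties with a score of 1.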



And here we go, that's what I got!


Comments

  1. I genuinely want to hear your feedback and advice about the next journal.
    1. Should I write in more detail about Python modules, such as pandas, numpy, scikit-learn, ... and practice with these modules in order to improve my knowledge of Python? Or,
    2. Should I learn how to create an interface for this movie recommendation machine?

    I noted that you suggested I think about the front end; can you go into more detail about this, please?

    Your advice is so helpful; it motivates me to study more and more!
    Thank you so much for your time!

    Bao Huynh

    Replies
    1. Bao,

      Very nice work! I’m glad the program worked for you.
      With regard to my comment on front-end planning: it’s more for future projects. The more detailed your project plans are, the faster and easier your projects will progress.
      For future directions, I’d like you to choose whichever path will be better for your future. So if learning and detailing more of Python will help you most, choose that. If interfacing is the area you’d like more experience in, then go with that option. I support either direction.
