Python Tutorial: Bipartite graphs and recommendation systems

---
Awesome stuff! I hope the last chapter’s introduction to bipartite graphs was informative! In this chapter, we’re going to see how bipartite graphs can be applied to recommendation system problems.

In the previous course’s final chapter, you used the unipartite version of the GitHub editing network to recommend users to connect to one another; on GitHub, users develop code with others on repositories. In this chapter, you’ve been working with the bipartite version. (We will see how the two are related later, so don’t worry about this detail for now!)

We’re now going to see how we can recommend repositories for users to work on, which is an alternative to recommending users to work with.

The concept is founded on “set overlaps” between similar nodes in the same partition: two users are similar when the sets of repositories they are connected to overlap. Let’s say we have a bipartite graph as shown here. In it, user1 is connected to repo2, and we want to recommend repositories that user1 might be interested in working on.

What we can do is ask which users other than user1 are connected to repo2; in this case, these are user2 and user3.

Both user2 and user3 are connected to repo2, but user3 is also connected to another repository, repo1. We may therefore want to recommend repo1 to user1 as a repository to contribute to.
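The reasoning above can be sketched in a few lines of plain Python. This is only an illustration, not the course’s solution: the `contributions` mapping and the `recommend_repos` function are hypothetical names, and a dict of sets stands in for the graph so the example runs without any library.

```python
# Hypothetical stand-in for the example graph: each user maps to the set of
# repositories that user is connected to.
contributions = {
    "user1": {"repo2"},
    "user2": {"repo2"},
    "user3": {"repo1", "repo2"},
}


def recommend_repos(user, contributions):
    """Suggest repos held by users who share at least one repo with `user`."""
    own_repos = contributions[user]
    recommended = set()
    for other, other_repos in contributions.items():
        if other == user:
            continue
        # Only consider users whose repositories overlap with ours.
        if own_repos & other_repos:
            # Anything they work on that we do not is a candidate.
            recommended |= other_repos - own_repos
    return recommended


print(recommend_repos("user1", contributions))  # {'repo1'}
```

user2 shares repo2 with user1 but has nothing new to offer, so the only suggestion comes from user3.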

One thing that may come in handy for the following exercises is the idea of using set operations in your code. Let’s say we have the graph from before represented in code. You’ll see that the node list contains all 6 nodes, belonging to both the repositories and users partitions, and that the edge list contains the four edges between them.
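One way such a graph might be built with NetworkX is sketched below, assuming NetworkX 2.x or later. The narration names only five of the six nodes, so `repo3` is an assumed name for the sixth, which has no edges; the partition labels `"users"` and `"repositories"` are likewise assumptions.

```python
import networkx as nx

G = nx.Graph()
# The `bipartite` node attribute is the usual NetworkX way to mark partitions;
# the label values used here are assumptions.
G.add_nodes_from(["user1", "user2", "user3"], bipartite="users")
# repo3 is an assumed name for the sixth node, which has no edges.
G.add_nodes_from(["repo1", "repo2", "repo3"], bipartite="repositories")
G.add_edges_from([
    ("user1", "repo2"),
    ("user2", "repo2"),
    ("user3", "repo2"),
    ("user3", "repo1"),
])

print(len(G.nodes()), len(G.edges()))  # 6 4

# Pull out one partition by filtering on the node attribute.
users = [n for n, d in G.nodes(data=True) if d["bipartite"] == "users"]
print(users)  # ['user1', 'user2', 'user3']
```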

Suppose we want to see which neighbors are shared between two nodes, user1 and user3. First, to get the neighbors of user1 and user3, we call the G.neighbors method, which returns the nodes connected by an edge to the node passed in (as a list in NetworkX 1.x, as an iterator from 2.0 onward).

To see which neighbors are shared, we can use a set intersection: cast user1_nbrs to a set, then call its intersection method, passing in the other container of nodes, to get the common elements. Here, the intersection of the user1 and user3 neighbors is repo2, because they both contributed to repo2.

We can also take the difference between the two, which gives the nodes that are in the left node set but not in the right node set. With user3’s neighbors on the left and user1’s on the right, the difference is repo1: a repo that user3 contributed to and user1 didn’t. As such, it is one candidate repository that could be recommended to user1.
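The set operations just described might look like this in code. It is a minimal sketch that rebuilds the example graph from its four edges; the variable names `shared` and `candidates` are illustrative choices.

```python
import networkx as nx

G = nx.Graph()
G.add_edges_from([
    ("user1", "repo2"),
    ("user2", "repo2"),
    ("user3", "repo2"),
    ("user3", "repo1"),
])

# Convert to sets right away: in NetworkX 2.x and later, G.neighbors() returns
# an iterator, which would be exhausted after a single pass.
user1_nbrs = set(G.neighbors("user1"))
user3_nbrs = set(G.neighbors("user3"))

# Repositories both users are connected to.
shared = user1_nbrs.intersection(user3_nbrs)
# Repositories user3 is connected to that user1 is not.
candidates = user3_nbrs.difference(user1_nbrs)

print(shared)      # {'repo2'}
print(candidates)  # {'repo1'}
```

Note that the difference is not symmetric: swapping the two sets would return the repos of user1 that user3 lacks, which is the empty set here.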

Okay! Let’s go practice working with bipartite graphs!
