# GCNs for travel recommendations
Howdy 🤠, I have been nerding out over recommenders for the past few months and have come across a really interesting area of machine learning in the process: Graph Neural Networks.
For anyone reading this, I recommend this post as a really great and technical breakdown of how they work, and how convolutional neural networks play into them.
[https://distill.pub/2021/gnn-intro/](https://distill.pub/2021/gnn-intro/ "a gentle introduction to graph neural networks")
In my last posts, I used similarity as a metric for recommending trip itineraries in a graph. This was cool, but I wanted to explore other means of collaborative filtering to the same effect. After some looking online, I found that a potential fit for my sparse, heterogeneous graph of travel data was LightGCN, a graph convolutional network built specifically for recommendation.
Here is the research paper for the model: [https://arxiv.org/pdf/2002.02126](https://arxiv.org/pdf/2002.02126 "LightGCN: Simplifying and Powering Graph Convolution Network for Recommendation")
Using the same dataset as before, I tried my hand at training a new LightGCN model to predict links (or, in our case, create recommendation nodes) from our graph. I will post some of the code below.
```python
from typing import Tuple

import torch
import torch.nn as nn
import torch.nn.functional as F


class LightGCN(nn.Module):
    """
    LightGCN implementation for a travel recommendation system.
    Simplified version of the original LightGCN paper for heterogeneous graphs.
    """
    def __init__(
        self,
        num_users: int,
        num_items: int,
        embedding_dim: int = 64,
        num_layers: int = 3,
        dropout: float = 0.1,
        device: str = 'cpu'
    ):
        super(LightGCN, self).__init__()
        self.num_users = num_users
        self.num_items = num_items
        self.embedding_dim = embedding_dim
        self.num_layers = num_layers
        self.dropout = dropout
        self.device = device

        # Initialize embeddings
        self.user_embedding = nn.Embedding(num_users, embedding_dim)
        self.item_embedding = nn.Embedding(num_items, embedding_dim)

        # Initialize weights
        self._init_weights()

    def _init_weights(self):
        """Initialize embedding weights using Xavier initialization."""
        nn.init.xavier_uniform_(self.user_embedding.weight)
        nn.init.xavier_uniform_(self.item_embedding.weight)

    def get_ego_embeddings(self) -> torch.Tensor:
        """Get the concatenated user and item embeddings."""
        user_embeddings = self.user_embedding.weight
        item_embeddings = self.item_embedding.weight
        ego_embeddings = torch.cat([user_embeddings, item_embeddings], dim=0)
        return ego_embeddings

    def forward(self, edge_index: torch.Tensor) -> Tuple[torch.Tensor, torch.Tensor]:
        """
        Forward pass of LightGCN.

        Args:
            edge_index: Edge index tensor of shape [2, num_edges]

        Returns:
            user_embeddings: Final user embeddings
            item_embeddings: Final item embeddings
        """
        # Get initial embeddings
        ego_embeddings = self.get_ego_embeddings()

        # Store embeddings for each layer
        all_embeddings = [ego_embeddings]

        # LightGCN propagation
        for layer in range(self.num_layers):
            # Normalize adjacency matrix (symmetric normalization).
            # Note: recomputing this every layer is wasteful; the real
            # LightGCN precomputes it once before training.
            edge_index_norm = self._normalize_adj(edge_index, ego_embeddings.size(0))

            # Message passing: E^(l+1) = D^(-1/2) * A * D^(-1/2) * E^(l)
            ego_embeddings = torch.sparse.mm(edge_index_norm, ego_embeddings)

            # Apply dropout
            ego_embeddings = F.dropout(ego_embeddings, p=self.dropout, training=self.training)
            all_embeddings.append(ego_embeddings)

        # Layer combination (sum all layer embeddings; the paper averages them)
        all_embeddings = torch.stack(all_embeddings, dim=1)
        all_embeddings = torch.sum(all_embeddings, dim=1)

        # Split back into users and items
        user_embeddings, item_embeddings = torch.split(
            all_embeddings, [self.num_users, self.num_items], dim=0
        )
        return user_embeddings, item_embeddings

    def _normalize_adj(self, edge_index: torch.Tensor, num_nodes: int) -> torch.Tensor:
        """
        Normalize the adjacency matrix using symmetric normalization.

        Args:
            edge_index: Edge index tensor (assumed to contain both directions
                of every undirected edge)
            num_nodes: Number of nodes

        Returns:
            Normalized adjacency matrix as a sparse tensor
        """
        # Create a dense adjacency matrix (fine for a small graph like mine)
        adj = torch.zeros((num_nodes, num_nodes), device=self.device)
        adj[edge_index[0], edge_index[1]] = 1.0

        # Add self-loops
        adj = adj + torch.eye(num_nodes, device=self.device)

        # Calculate degree matrix
        degree = torch.sum(adj, dim=1)
        degree_inv_sqrt = torch.pow(degree, -0.5)
        degree_inv_sqrt[torch.isinf(degree_inv_sqrt)] = 0.0

        # Symmetric normalization: D^(-1/2) * A * D^(-1/2)
        adj_normalized = torch.mm(
            torch.mm(torch.diag(degree_inv_sqrt), adj),
            torch.diag(degree_inv_sqrt)
        )
        return adj_normalized.to_sparse()

    def predict(self, user_ids: torch.Tensor, item_ids: torch.Tensor,
                edge_index: torch.Tensor) -> torch.Tensor:
        """
        Predict scores for user-item pairs.

        Args:
            user_ids: User IDs tensor
            item_ids: Item IDs tensor
            edge_index: Edge index tensor used for propagation

        Returns:
            Prediction scores
        """
        user_embeddings, item_embeddings = self.forward(edge_index)
        user_emb = user_embeddings[user_ids]
        item_emb = item_embeddings[item_ids]

        # Inner product for prediction
        scores = torch.sum(user_emb * item_emb, dim=1)
        return scores
```
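To sanity-check the model end to end, here's roughly how it gets exercised. This is a toy sketch rather than my real pipeline: the sizes and edge list are made up, and item indices are offset by `num_users` because users and items share one node ID space.

```python
# Toy graph: 3 users, 5 items (destinations, hotels, flights, ...)
model = LightGCN(num_users=3, num_items=5, embedding_dim=64, num_layers=3)

# Each undirected user-item edge appears in both directions;
# item j lives at node index num_users + j.
edge_index = torch.tensor([
    [0, 3, 1, 4, 2, 7],
    [3, 0, 4, 1, 7, 2],
])

model.eval()
with torch.no_grad():
    user_emb, item_emb = model(edge_index)
    # Score every item for user 0, then take the top 3 as recommendations
    scores = user_emb[0] @ item_emb.T
    top_items = torch.topk(scores, k=3).indices
```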
This is then extended inside a wrapper class that pulls my graph's data, trains the model, and predicts outcomes.
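I won't paste the whole wrapper here, but a minimal version of its training step, using the BPR (Bayesian Personalized Ranking) loss the LightGCN paper trains with, looks something like this. The `users` / `pos_items` / `neg_items` batches are assumed to come from a sampler over the graph's observed edges:

```python
import torch.optim as optim

optimizer = optim.Adam(model.parameters(), lr=1e-3)

def bpr_step(model, edge_index, users, pos_items, neg_items, reg=1e-4):
    """One BPR step: push observed (positive) user-item pairs to score
    higher than unobserved (negative) ones."""
    user_emb, item_emb = model(edge_index)
    pos_scores = (user_emb[users] * item_emb[pos_items]).sum(dim=1)
    neg_scores = (user_emb[users] * item_emb[neg_items]).sum(dim=1)

    loss = -F.logsigmoid(pos_scores - neg_scores).mean()
    # L2 regularization on the layer-0 embeddings, as in the paper
    loss = loss + reg * (
        model.user_embedding(users).norm(2).pow(2)
        + model.item_embedding(pos_items).norm(2).pow(2)
        + model.item_embedding(neg_items).norm(2).pow(2)
    ) / users.size(0)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```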
Initially, I got some pretty cool results.

*An example of the graph's taxonomy, with recommendations created from our LightGCN model: the John Smith user node with two recommendations.*
To dig a little deeper on the qualitative side, let's look at what trips John Smith actually took that we trained our model on.
*John Smith took one trip to London by flight and stayed in a hotel.*
If I'm being blunt, these results seem pretty random. My code above is also not a direct adaptation of the real LightGCN algorithm: it re-normalizes the adjacency matrix on every pass of training, while the real GCN does that once before training. The real LightGCN also removes feature transformations and activation functions from the code entirely.

That being said, it is pretty interesting that we can take that little bit of information about our node and make 40 or so recommendations, consisting of trips with links to destination, accommodation, and transportation nodes. The quality and confidence of these predictions are currently a little questionable, but I am excited to toy around with the graph topology, potentially putting less info in metadata and more info in nodes and edges, in order to improve them.
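For reference, a version closer to the paper would normalize the adjacency matrix once up front, drop the dropout, and average (rather than sum) the layer embeddings. A minimal sketch, assuming the full `edge_index` is available at construction time:

```python
class LightGCNPaper(LightGCN):
    """Closer to the paper: adjacency normalized once, no dropout in
    propagation, and layer embeddings averaged instead of summed."""
    def __init__(self, *args, edge_index: torch.Tensor, **kwargs):
        super().__init__(*args, **kwargs)
        # Normalize once, up front, instead of on every forward pass
        num_nodes = self.num_users + self.num_items
        self.adj_norm = self._normalize_adj(edge_index, num_nodes)

    def forward(self, edge_index: torch.Tensor = None):
        ego = self.get_ego_embeddings()
        layers = [ego]
        for _ in range(self.num_layers):
            ego = torch.sparse.mm(self.adj_norm, ego)
            layers.append(ego)
        # Average across layers, per the paper's layer combination
        out = torch.stack(layers, dim=1).mean(dim=1)
        return torch.split(out, [self.num_users, self.num_items], dim=0)
```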
Rethinking the topology should help in particular because we currently aren't looking at metadata at all when we create the node and edge embeddings used for predictions.
Ultimately, LightGCN seems like a promising step forward for lightweight GCN models in the travel space. Even with my slightly hacked-together version, it's already cranking out recommendations from a tiny slice of user history. It's not perfect, and right now it's ignoring all the metadata. Next, I want to teach it why people like certain trips, not just who went where, maybe by factoring in the metadata as embeddings while making recommendations.
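One rough idea for that, purely a sketch and nothing I've implemented yet: give each categorical piece of metadata (transport mode, accommodation type, and so on) its own embedding table and add it into the initial item embeddings, so the metadata actually participates in the propagation. The `item_meta_ids` lookup here is hypothetical:

```python
class MetadataLightGCN(LightGCN):
    """Hypothetical sketch: fold categorical trip metadata into the
    initial item embeddings before message passing."""
    def __init__(self, num_users, num_items, num_meta_categories,
                 item_meta_ids: torch.Tensor, **kwargs):
        super().__init__(num_users, num_items, **kwargs)
        # One learned vector per metadata category
        self.meta_embedding = nn.Embedding(num_meta_categories, self.embedding_dim)
        # One category ID per item, aligned with item indices
        self.item_meta_ids = item_meta_ids

    def get_ego_embeddings(self) -> torch.Tensor:
        user_emb = self.user_embedding.weight
        # Item embedding = learned ID embedding + its metadata embedding
        item_emb = self.item_embedding.weight + self.meta_embedding(self.item_meta_ids)
        return torch.cat([user_emb, item_emb], dim=0)
```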