I generated most of the code in this post by prompting Grok, so thanks to the developers who made their code available. I checked the generated code for accuracy and consistency against two reference implementations, LightGCN and SELFRec, and against the course notes for CS224W: Machine Learning with Graphs.
@inproceedings{kipf2016semi,
title={Semi-supervised classification with graph convolutional networks},
author={Kipf, Thomas N and Welling, Max},
booktitle={International Conference on Learning Representations},
year={2017}
}
@inproceedings{wu2019simplifying,
title={Simplifying Graph Convolutional Networks},
author={Wu, Felix and Souza, Amauri and Zhang, Tianyi and Fifty, Christopher and Yu, Tao and Weinberger, Kilian},
booktitle={International Conference on Machine Learning},
pages={6861--6871},
year={2019}
}
@inproceedings{wang2019neural,
title={Neural graph collaborative filtering},
author={Wang, Xiang and He, Xiangnan and Wang, Meng and Feng, Fuli and Chua, Tat-Seng},
booktitle={Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval},
pages={165--174},
year={2019}
}
@inproceedings{he2020lightgcn,
title={LightGCN: Simplifying and Powering Graph Convolution Network for Recommendation},
author={He, Xiangnan and Deng, Kuan and Wang, Xiang and Li, Yan and Zhang, Yongdong and Wang, Meng},
booktitle={Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval},
pages={639--648},
year={2020}
}
The key distinction between GCN and Simple Graph Convolution (SGC) is that SGC removes the non-linearities between layers. The stack of graph convolutions then collapses into repeated propagation with the normalized adjacency matrix followed by a single weight matrix.
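In symbols, with $\tilde{A}$ the normalized adjacency matrix, $X$ the input features, $H^{(k)}$ the layer-$k$ representations, $\sigma$ a non-linearity, and $W^{(k)}$, $\Theta$ learned weights:

$H^{(k+1)} = \sigma(\tilde{A} H^{(k)} W^{(k)}) \quad \text{(GCN layer)}$

$\hat{Y} = \mathrm{softmax}(\tilde{A}^{K} X \Theta) \quad \text{(SGC: } K \text{ propagation steps collapsed into one linear map)}$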
A bipartite graph is a graph whose vertices can be separated into two sets $X$ and $Y$ in such a way that every edge in the graph has one end point in each set (Marcus, 2020, “Graph Theory: A Problem Oriented Approach”, p. 45). To avoid confusion, we will rename the two sets in our bipartite graph to users $u$ and items $i$.
Here’s the structure of the adjacency matrix for a bipartite graph.
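Writing $R$ for the $|u| \times |i|$ user–item interaction matrix, the adjacency matrix of the bipartite graph has nonzero entries only in its two off-diagonal blocks:

$A = \begin{pmatrix} 0 & R \\ R^{\top} & 0 \end{pmatrix}$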
GCN and SGC papers use $H$ for embeddings, but NGCF and LightGCN use $E$. Although they represent the same thing, $E$ can be confusing because it’s also used for the set of edges.
This dataset, sourced from MovieLens, a movie recommendation platform, provides movie ratings. We’ll be using the 100K ratings variant, available for download on Kaggle. Below is a description of the files included.
File Name | Description |
---|---|
u.data | This is the core file. It contains ratings data: user ID, movie ID, rating, timestamp |
u.genre | List of movie genres |
u.info | Summary statistics of the dataset |
u.item | Movie information: movie ID, title, release date, genres |
u.occupation | List of user occupations |
u.user | User information: user ID, age, gender, occupation, zip code |
u1.base | Training set for fold 1 of 5-fold cross-validation |
u1.test | Test set for fold 1 of 5-fold cross-validation |
u2.base | Training set for fold 2 of 5-fold cross-validation |
u2.test | Test set for fold 2 of 5-fold cross-validation |
u3.base | Training set for fold 3 of 5-fold cross-validation |
u3.test | Test set for fold 3 of 5-fold cross-validation |
u4.base | Training set for fold 4 of 5-fold cross-validation |
u4.test | Test set for fold 4 of 5-fold cross-validation |
u5.base | Training set for fold 5 of 5-fold cross-validation |
u5.test | Test set for fold 5 of 5-fold cross-validation |
ua.base | Additional training set split |
ua.test | Additional test set split |
ub.base | Another additional training set split |
ub.test | Another additional test set split |
import kagglehub
# Download latest version
path = kagglehub.dataset_download("prajitdatta/movielens-100k-dataset")
print("Path to dataset files:", path)
!pip install kaggle
!mkdir -p ~/.kaggle
!cp kaggle.json ~/.kaggle/
!chmod 600 ~/.kaggle/kaggle.json
!kaggle datasets download -d prajitdatta/movielens-100k-dataset
!unzip movielens-100k-dataset.zip
Downloading movielens-100k-dataset.zip to /content
84% 4.00M/4.77M [00:00<00:00, 5.98MB/s]
100% 4.77M/4.77M [00:00<00:00, 5.04MB/s]
Archive: movielens-100k-dataset.zip
inflating: ml-100k/README
inflating: ml-100k/allbut.pl
inflating: ml-100k/mku.sh
inflating: ml-100k/u.data
inflating: ml-100k/u.genre
inflating: ml-100k/u.info
inflating: ml-100k/u.item
inflating: ml-100k/u.occupation
inflating: ml-100k/u.user
inflating: ml-100k/u1.base
inflating: ml-100k/u1.test
inflating: ml-100k/u2.base
inflating: ml-100k/u2.test
inflating: ml-100k/u3.base
inflating: ml-100k/u3.test
inflating: ml-100k/u4.base
inflating: ml-100k/u4.test
inflating: ml-100k/u5.base
inflating: ml-100k/u5.test
inflating: ml-100k/ua.base
inflating: ml-100k/ua.test
inflating: ml-100k/ub.base
inflating: ml-100k/ub.test
import torch
import torch.nn as nn
import torch.optim as optim
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
# Set device
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
data = pd.read_csv('ml-100k/u.data', sep='\t', names=['user_id', 'movie_id', 'rating', 'timestamp'])
print(data.shape)
print(data.head(10))
train_data = pd.read_csv('ml-100k/ua.base', sep='\t', names=['user_id', 'movie_id', 'rating', 'timestamp'])
print(train_data.shape)
print(train_data.head(10))
test_data = pd.read_csv('ml-100k/ua.test', sep='\t', names=['user_id', 'movie_id', 'rating', 'timestamp'])
print(test_data.shape)
print(test_data.head(10))
movie_data = pd.read_csv('ml-100k/u.item', sep='|', encoding='latin-1',
names=['movie_id', 'title', 'release_date'], usecols=[0, 1, 2])
print(movie_data.shape)
print(movie_data.head(10))
(100000, 4)
user_id movie_id rating timestamp
0 196 242 3 881250949
1 186 302 3 891717742
2 22 377 1 878887116
3 244 51 2 880606923
4 166 346 1 886397596
5 298 474 4 884182806
6 115 265 2 881171488
7 253 465 5 891628467
8 305 451 3 886324817
9 6 86 3 883603013
(90570, 4)
user_id movie_id rating timestamp
0 1 1 5 874965758
1 1 2 3 876893171
2 1 3 4 878542960
3 1 4 3 876893119
4 1 5 3 889751712
5 1 6 5 887431973
6 1 7 4 875071561
7 1 8 1 875072484
8 1 9 5 878543541
9 1 10 3 875693118
(9430, 4)
user_id movie_id rating timestamp
0 1 20 4 887431883
1 1 33 4 878542699
2 1 61 4 878542420
3 1 117 3 874965739
4 1 155 2 878542201
5 1 160 4 875072547
6 1 171 5 889751711
7 1 189 3 888732928
8 1 202 5 875072442
9 1 265 4 878542441
(1682, 3)
movie_id title release_date
0 1 Toy Story (1995) 01-Jan-1995
1 2 GoldenEye (1995) 01-Jan-1995
2 3 Four Rooms (1995) 01-Jan-1995
3 4 Get Shorty (1995) 01-Jan-1995
4 5 Copycat (1995) 01-Jan-1995
5 6 Shanghai Triad (Yao a yao yao dao waipo qiao) ... 01-Jan-1995
6 7 Twelve Monkeys (1995) 01-Jan-1995
7 8 Babe (1995) 01-Jan-1995
8 9 Dead Man Walking (1995) 01-Jan-1995
9 10 Richard III (1995) 22-Jan-1996
Mapping user and item IDs to continuous indices (0 to num_users-1 and 0 to num_items-1) is essential because PyTorch embeddings expect zero-based indices within bounds. Without this, raw IDs (e.g., 1 to 943) could exceed embedding sizes or cause gaps, leading to out-of-bounds errors like “index 2625 is out of bounds for dimension 0 with size 2625”.
user_ids = sorted(train_data['user_id'].unique()) # 1 to 943
user_id_mapping = {id: i for i, id in enumerate(user_ids)}
item_ids = sorted(train_data['movie_id'].unique()) # 1680 unique movie IDs appear in the training split (original IDs range from 1 to 1682)
item_id_mapping = {id: i for i, id in enumerate(item_ids)}
train_data['user_id'] = train_data['user_id'].map(user_id_mapping)
test_data['user_id'] = test_data['user_id'].map(user_id_mapping)
train_data['movie_id'] = train_data['movie_id'].map(item_id_mapping)
test_data['movie_id'] = test_data['movie_id'].map(item_id_mapping)
num_users = len(user_ids)
num_items = len(item_ids)
print("Number of unique users:", num_users)
print("Number of unique items:", num_items)
print(train_data.head(10))
print(test_data.head(10))
Number of unique users: 943
Number of unique items: 1680
user_id movie_id rating timestamp
0 0 0 5 874965758
1 0 1 3 876893171
2 0 2 4 878542960
3 0 3 3 876893119
4 0 4 3 889751712
5 0 5 5 887431973
6 0 6 4 875071561
7 0 7 1 875072484
8 0 8 5 878543541
9 0 9 3 875693118
user_id movie_id rating timestamp
0 0 19.0 4 887431883
1 0 32.0 4 878542699
2 0 60.0 4 878542420
3 0 116.0 3 874965739
4 0 154.0 2 878542201
5 0 159.0 4 875072547
6 0 170.0 5 889751711
7 0 188.0 3 888732928
8 0 201.0 5 875072442
9 0 264.0 4 878542441
# Test interactions whose movie never appears in the training split map to NaN; drop them and cast back to integers before building tensors
test_data = test_data.dropna(subset=['movie_id'])
test_data['movie_id'] = test_data['movie_id'].astype(int)
# Interaction tensors
train_interactions = torch.tensor(train_data[['user_id', 'movie_id']].values, dtype=torch.long)
test_interactions = torch.tensor(test_data[['user_id', 'movie_id']].values, dtype=torch.long)
# insights about train_interactions
print(train_interactions.shape)
print(train_interactions[:10])
print("Data type:", train_interactions.dtype)
print("Device:", train_interactions.device)
torch.Size([90570, 2])
tensor([[0, 0],
[0, 1],
[0, 2],
[0, 3],
[0, 4],
[0, 5],
[0, 6],
[0, 7],
[0, 8],
[0, 9]])
Data type: torch.int64
Device: cpu
We’ll compute the Adjacency $A$, Degree $D$, and Normalized Adjacency $\tilde{A}$ matrices in the following code:
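With $D$ the diagonal degree matrix, $D_{vv} = \sum_{w} A_{vw}$, the symmetric normalization computed below is

$\tilde{A} = D^{-1/2} A D^{-1/2}$

so each edge $(v, w)$ receives the weight $1 / \sqrt{d_v d_w}$, which is what norm_values holds.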
# Adjacency matrix
rows = torch.cat([train_interactions[:, 0], train_interactions[:, 1] + num_users], dim=0)
cols = torch.cat([train_interactions[:, 1] + num_users, train_interactions[:, 0]], dim=0)
indices = torch.stack([rows, cols], dim=0).to(device)
values = torch.ones(indices.shape[1], device=device)
adj = torch.sparse_coo_tensor(indices, values, size=(num_users + num_items, num_users + num_items), device=device)
# Normalized adjacency matrix
degrees = torch.sparse.sum(adj, dim=1).to_dense()
norm_values = 1.0 / (torch.sqrt(degrees[rows]) * torch.sqrt(degrees[cols])).to(device)
norm_adj = torch.sparse_coo_tensor(indices, norm_values, size=(num_users + num_items, num_users + num_items), device=device)
These plots visualize the adjacency matrix. Because the graph is bipartite, the nonzero entries appear only in the off-diagonal user–item and item–user blocks; the user–user and item–item blocks are empty.
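A minimal matplotlib sketch to produce such a scatter of the nonzero entries (reusing the indices tensor built above; not part of the original code) could look like this:
# Plot the sparsity pattern of the adjacency matrix
rows_np = indices[0].cpu().numpy()
cols_np = indices[1].cpu().numpy()
plt.figure(figsize=(6, 6))
plt.scatter(cols_np, rows_np, s=0.1)
plt.gca().invert_yaxis()  # put row 0 at the top, like a matrix
plt.xlabel('column index (users, then items)')
plt.ylabel('row index (users, then items)')
plt.title('Nonzero entries of the bipartite adjacency matrix')
plt.show()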
The LightGCN class is shown below, where the forward function handles the main calculations.
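Following the LightGCN formulation, each layer simply propagates the stacked user/item embeddings with the normalized adjacency (no feature transform, no non-linearity), the final embedding averages all layers, and scores are dot products (here $E^{(0)}$ are the learned embeddings):

$E^{(k+1)} = \tilde{A} E^{(k)}, \qquad E = \frac{1}{K+1} \sum_{k=0}^{K} E^{(k)}, \qquad \hat{y}_{ui} = e_u^{\top} e_i$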
class LightGCN(nn.Module):
def __init__(self, num_users, num_items, embedding_dim, num_layers, norm_adj, device):
super(LightGCN, self).__init__()
self.num_users = num_users
self.num_items = num_items
self.embedding_dim = embedding_dim
self.num_layers = num_layers
self.device = device
self.register_buffer('norm_adj', norm_adj)
self.user_embeddings = nn.Embedding(num_users, embedding_dim)
self.item_embeddings = nn.Embedding(num_items, embedding_dim)
nn.init.normal_(self.user_embeddings.weight, std=0.01) # Initialize user embeddings with a normal distribution (mean=0, std=0.01) for small random values
nn.init.normal_(self.item_embeddings.weight, std=0.01) # Initialize item embeddings with a normal distribution (mean=0, std=0.01) for small random values
def forward(self, users, pos_items, neg_items):
all_embeddings = torch.cat([self.user_embeddings.weight, self.item_embeddings.weight], dim=0)
ego_embeddings = all_embeddings
for _ in range(self.num_layers): # Loop over the specified number of graph convolution layers
all_embeddings = torch.spmm(self.norm_adj, all_embeddings) # Perform sparse matrix multiplication with normalized adjacency matrix to propagate embeddings
ego_embeddings = ego_embeddings + all_embeddings # Accumulate the propagated embeddings out of place (avoids an in-place update of a tensor already used in the graph)
final_embeddings = ego_embeddings / (self.num_layers + 1) # Average the embeddings across all layers (including ego layer) for smoothness
user_emb = final_embeddings[users]
pos_item_emb = final_embeddings[self.num_users + pos_items] # item indices run from 0 to num_items-1, so offset by num_users to index into the stacked embeddings
neg_item_emb = final_embeddings[self.num_users + neg_items] # same offset for the sampled negative items
pos_scores = (user_emb * pos_item_emb).sum(dim=-1) # predicted scores of positive samples
neg_scores = (user_emb * neg_item_emb).sum(dim=-1) # predicted scores of negative samples
return pos_scores, neg_scores
def get_embeddings(self):
all_embeddings = torch.cat([self.user_embeddings.weight, self.item_embeddings.weight], dim=0)
ego_embeddings = all_embeddings
for _ in range(self.num_layers):
all_embeddings = torch.spmm(self.norm_adj, all_embeddings)
ego_embeddings = ego_embeddings + all_embeddings
final_embeddings = ego_embeddings / (self.num_layers + 1)
user_emb = final_embeddings[:self.num_users]
item_emb = final_embeddings[self.num_users:]
return user_emb, item_emb
The chart illustrates how the recommender is trained: positive pairs are taken from the training interactions, and negative pairs are generated by random sampling.
On the left are the positive items (movies each user rated in the training set); on the right are the randomly sampled negative items.
Here’s a distribution of positive and negative pairs.
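The training loop below samples negatives uniformly with torch.randint, which can occasionally draw an item the user actually rated. A stricter sampler that rejects known positives could look like this sketch (the helper name and the user_pos_sets dictionary are illustrative, not part of the code below):
def sample_negatives(users, num_items, user_pos_sets):
    # users: tensor of mapped user indices; user_pos_sets: dict user index -> set of positive item indices
    neg_items = torch.randint(0, num_items, (len(users),))
    for idx, u in enumerate(users.tolist()):
        # Re-draw until the sampled item is not one of this user's training positives
        while neg_items[idx].item() in user_pos_sets.get(u, set()):
            neg_items[idx] = torch.randint(0, num_items, (1,)).item()
    return neg_items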
# Hyperparameters
embedding_dim = 64
num_layers = 3
learning_rate = 1e-3
num_epochs = 1000
lambda_reg = 1e-6 # Regularization weight
# Initialize model
model = LightGCN(num_users, num_items, embedding_dim, num_layers, norm_adj, device)
model.to(device)
optimizer = optim.Adam(model.parameters(), lr=learning_rate)
for epoch in range(num_epochs):
model.train()
total_loss = 0
users = train_interactions[:, 0].to(device)
pos_items = train_interactions[:, 1].to(device)
neg_items = torch.randint(0, num_items, (len(train_interactions),), device=device)
# Forward pass
pos_scores, neg_scores = model(users, pos_items, neg_items)
bpr_loss = -torch.log(torch.sigmoid(pos_scores - neg_scores)).mean()
# L2 Regularization
user_embeddings = model.user_embeddings.weight
item_embeddings = model.item_embeddings.weight
# Squared L2 norm of the embeddings (sum of squared elements); note this regularizes all embeddings, not just those in the current batch
reg_loss = (user_embeddings**2).sum() + (item_embeddings**2).sum()
# Combine BPR loss and regularization term with a weight factor (lambda)
loss = bpr_loss + lambda_reg * reg_loss
# Backward pass
optimizer.zero_grad()
loss.backward()
optimizer.step()
total_loss = loss.item()
print(f'Epoch {epoch + 1}/{num_epochs}, Loss: {total_loss:.4f}')
Epoch 1/1000, Loss: 0.6932
Epoch 2/1000, Loss: 0.6931
Epoch 3/1000, Loss: 0.6931
Epoch 4/1000, Loss: 0.6930
Epoch 5/1000, Loss: 0.6930
Epoch 6/1000, Loss: 0.6928
Epoch 7/1000, Loss: 0.6927
Epoch 8/1000, Loss: 0.6925
Epoch 9/1000, Loss: 0.6923
Epoch 10/1000, Loss: 0.6920
...
...
...
Epoch 991/1000, Loss: 0.3547
Epoch 992/1000, Loss: 0.3540
Epoch 993/1000, Loss: 0.3530
Epoch 994/1000, Loss: 0.3565
Epoch 995/1000, Loss: 0.3559
Epoch 996/1000, Loss: 0.3541
Epoch 997/1000, Loss: 0.3553
Epoch 998/1000, Loss: 0.3559
Epoch 999/1000, Loss: 0.3535
Epoch 1000/1000, Loss: 0.3526
Recommender systems like this one are trained with the Bayesian Personalized Ranking (BPR) loss. It combines: (1) the negative log-sigmoid of the difference between the predicted scores of a positive and a negative item, and (2) an L2 regularization term weighted by $\lambda$. The objective pushes the model to score positive items higher than negative ones.
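Written out, with $\hat{y}_{ui}$ the predicted score of user $u$ for item $i$, the sum running over sampled triples of a user, a positive item $i$, and a negative item $j$, and $E^{(0)}$ the learnable embeddings:

$L_{BPR} = -\sum_{(u,i,j)} \ln \sigma(\hat{y}_{ui} - \hat{y}_{uj}) + \lambda \lVert E^{(0)} \rVert^{2}$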
A common evaluation metric for recommender systems is Recall@K. It quantifies how often the system recommends relevant items within the top-K recommendations. We determine the ‘correctness’ of the recommendations by comparing them to the items a user has in the test set.
$Recall@K = \frac{\text{Number of relevant items in top K recommendations}}{\text{Total number of relevant items}}$
The MovieLens ua.test split is structured such that each user has exactly 10 ratings, which makes k = 10 a natural evaluation point: a perfect ranker could place all 10 held-out items in a user’s top 10 and reach a recall of 1.0. Recall@k is non-decreasing in k, so larger cutoffs only credit hits that appear further down the ranked list.
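For example, if 2 of a user’s 10 held-out items appear among their top-10 recommendations, that user’s recall is

$Recall@10 = \frac{2}{10} = 0.2$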
def recall_at_k(model, test_interactions, k=10):
model.eval()
with torch.no_grad():
user_emb, item_emb = model.get_embeddings()
# Compute preference scores for all users against all items via matrix multiplication
scores = user_emb @ item_emb.T
# Get the top-k item indices with highest scores for each user
_, top_k_items = torch.topk(scores, k, dim=1)
top_k_items = top_k_items.cpu().numpy()
test_user_items = {}
# Populate the dictionary with user -> set of rated movies from test_interactions
for user, movie in test_interactions:
user = user.item()
movie = movie.item()
if user not in test_user_items:
test_user_items[user] = set()
test_user_items[user].add(movie)
recall_data = []
total_recall = 0
# Create reverse mapping from mapped user IDs to original user IDs
reverse_user_mapping = {v: k for k, v in user_id_mapping.items()}
# Iterate over all possible mapped user IDs (0 to num_users-1)
for mapped_user_id in range(num_users):
if mapped_user_id in test_user_items:
true_items = test_user_items[mapped_user_id]
pred_items = set(top_k_items[mapped_user_id])
hits = len(pred_items & true_items)
num_true_items = len(true_items)
# Compute recall for this user (hits / true items), 0 if no true items
user_recall = hits / num_true_items if num_true_items > 0 else 0
# Get the original user ID from the mapped ID
original_user_id = reverse_user_mapping[mapped_user_id]
# Store user data in a dictionary
recall_data.append({
'user_id': original_user_id, # Original user ID (e.g., 186)
'mapped_user_id': mapped_user_id, # Internal index (e.g., 1)
'hits': hits, # Number of correct predictions
'true_items_count': num_true_items, # Number of test set items
'recall': user_recall # Per-user recall value
})
total_recall += user_recall
total_recall = total_recall / len(test_user_items) if test_user_items else 0
recall_df = pd.DataFrame(recall_data)
return total_recall, recall_df
total_recall, recall_df = recall_at_k(model, test_interactions, k=10)
print(f"Total Recall@10: {total_recall:.4f}")
print("\nPer-User Recall Data (first 5 rows):")
print(recall_df.head())
Total Recall@10: 0.0957
Per-User Recall Data (first 5 rows):
user_id mapped_user_id hits true_items_count recall
0 1 0 0 10 0.0
1 2 1 0 10 0.0
2 3 2 2 10 0.2
3 4 3 2 10 0.2
4 5 4 1 10 0.1
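The scores above rank every item, including movies the user already rated in training, and the recommendation table further below confirms that many top-10 slots go to training items. A common refinement (not used in the code above) is to mask training interactions before taking the top-k; here is a sketch, assuming scores is the user-by-item score matrix computed inside recall_at_k:
# Hypothetical refinement: exclude (user, item) pairs seen in training before taking top-k
masked_scores = scores.clone()
train_idx = train_interactions.to(masked_scores.device)
masked_scores[train_idx[:, 0], train_idx[:, 1]] = float('-inf')
_, top_k_items = torch.topk(masked_scores, k, dim=1)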
def recommend_and_compare(model, user_ids, train_data, test_interactions, movie_data, k=10):
model.eval()
recommendations = []
with torch.no_grad():
user_emb, item_emb = model.get_embeddings()
if not user_ids:
print("No users specified for recommendations.")
return pd.DataFrame(columns=['user_id', 'mapped_user_id', 'movie_id', 'mapped_movie_id', 'title', 'in_train', 'in_test'])
test_user_items = {}
# Loop through each user-movie pair in test_interactions (a tensor of [mapped_user_id, mapped_movie_id])
for user, movie in test_interactions:
user = user.item()
movie = movie.item()
if user not in test_user_items:
test_user_items[user] = set()
test_user_items[user].add(movie)
# Create a reverse mapping from mapped movie IDs to original movie IDs (e.g., 0 -> 1 for Toy Story)
reverse_item_mapping = {v: k for k, v in item_id_mapping.items()}
# Iterate over each original user ID provided in the input user_ids list
for user_id in user_ids:
if user_id not in user_id_mapping:
print(f"Warning: User ID {user_id} not found in dataset. Skipping.")
continue
mapped_user_id = user_id_mapping[user_id]
user_scores = user_emb[mapped_user_id] @ item_emb.T
# Get the top-k indices (mapped_movie_ids) with the highest scores; _ discards the scores themselves
_, top_k_indices = torch.topk(user_scores, k)
top_k_indices = top_k_indices.cpu().numpy()
user_train_ratings = set(train_data[train_data['user_id'] == mapped_user_id]['movie_id'].tolist())
user_test_ratings = test_user_items.get(mapped_user_id, set())
# Convert the top-k mapped movie IDs to their original movie IDs using reverse_item_mapping
original_movie_ids = [reverse_item_mapping[mapped_id] for mapped_id in top_k_indices]
# Filter movie_data to get titles and original movie IDs for the recommended movies
recommended_movies = movie_data[movie_data['movie_id'].isin(original_movie_ids)][['movie_id', 'title']]
# Iterate over each recommended movie in the filtered DataFrame
for idx, row in recommended_movies.iterrows():
# Extract the original movie ID from movie_data (e.g., 1 for Toy Story)
original_movie_id = row['movie_id']
# Convert the original movie ID back to its mapped ID using item_id_mapping (e.g., 1 -> 0)
mapped_movie_id = item_id_mapping[original_movie_id]
title = row['title']
in_train = mapped_movie_id in user_train_ratings
in_test = mapped_movie_id in user_test_ratings
recommendations.append({
'user_id': user_id, # Original user ID from input (e.g., 1)
'mapped_user_id': mapped_user_id, # Mapped user ID (e.g., 0)
'movie_id': original_movie_id, # Original movie ID from u.item (e.g., 1)
'mapped_movie_id': mapped_movie_id, # Mapped movie ID (e.g., 0)
'title': title, # Movie title (e.g., "Toy Story (1995)")
'in_train': in_train, # Boolean: True if movie is in training set
'in_test': in_test # Boolean: True if movie is in test set
})
rec_df = pd.DataFrame(recommendations)
return rec_df
user_ids = [1, 2, 3, 4, 5]
rec_df = recommend_and_compare(model, user_ids, train_data, test_interactions, movie_data, k=10)
rec_df
user_id | mapped_user_id | movie_id | mapped_movie_id | title | in_train | in_test |
---|---|---|---|---|---|---|
1 | 0 | 1 | 0 | Toy Story (1995) | True | False |
1 | 0 | 7 | 6 | Twelve Monkeys (1995) | True | False |
1 | 0 | 50 | 49 | Star Wars (1977) | True | False |
1 | 0 | 56 | 55 | Pulp Fiction (1994) | True | False |
1 | 0 | 98 | 97 | Silence of the Lambs, The (1991) | True | False |
1 | 0 | 100 | 99 | Fargo (1996) | True | False |
1 | 0 | 121 | 120 | Independence Day (ID4) (1996) | True | False |
1 | 0 | 127 | 126 | Godfather, The (1972) | True | False |
1 | 0 | 174 | 173 | Raiders of the Lost Ark (1981) | True | False |
1 | 0 | 181 | 180 | Return of the Jedi (1983) | True | False |
2 | 1 | 258 | 257 | Contact (1997) | True | False |
2 | 1 | 269 | 268 | Full Monty, The (1997) | True | False |
2 | 1 | 286 | 285 | English Patient, The (1996) | True | False |
2 | 1 | 288 | 287 | Scream (1996) | True | False |
2 | 1 | 294 | 293 | Liar Liar (1997) | True | False |
2 | 1 | 300 | 299 | Air Force One (1997) | True | False |
2 | 1 | 302 | 301 | L.A. Confidential (1997) | True | False |
2 | 1 | 313 | 312 | Titanic (1997) | True | False |
2 | 1 | 328 | 327 | Conspiracy Theory (1997) | False | False |
2 | 1 | 748 | 747 | Saint, The (1997) | False | False |
3 | 2 | 258 | 257 | Contact (1997) | True | False |
3 | 2 | 286 | 285 | English Patient, The (1996) | False | False |
3 | 2 | 288 | 287 | Scream (1996) | True | False |
3 | 2 | 294 | 293 | Liar Liar (1997) | False | True |
3 | 2 | 300 | 299 | Air Force One (1997) | True | False |
3 | 2 | 302 | 301 | L.A. Confidential (1997) | True | False |
3 | 2 | 307 | 306 | Devil’s Advocate, The (1997) | True | False |
3 | 2 | 328 | 327 | Conspiracy Theory (1997) | False | True |
3 | 2 | 333 | 332 | Game, The (1997) | True | False |
3 | 2 | 748 | 747 | Saint, The (1997) | False | False |
4 | 3 | 258 | 257 | Contact (1997) | True | False |
4 | 3 | 286 | 285 | English Patient, The (1996) | False | False |
4 | 3 | 288 | 287 | Scream (1996) | False | True |
4 | 3 | 294 | 293 | Liar Liar (1997) | False | True |
4 | 3 | 300 | 299 | Air Force One (1997) | True | False |
4 | 3 | 302 | 301 | L.A. Confidential (1997) | False | False |
4 | 3 | 307 | 306 | Devil’s Advocate, The (1997) | False | False |
4 | 3 | 328 | 327 | Conspiracy Theory (1997) | True | False |
4 | 3 | 333 | 332 | Game, The (1997) | False | False |
4 | 3 | 748 | 747 | Saint, The (1997) | False | False |
5 | 4 | 56 | 55 | Pulp Fiction (1994) | False | False |
5 | 4 | 69 | 68 | Forrest Gump (1994) | True | False |
5 | 4 | 98 | 97 | Silence of the Lambs, The (1991) | False | True |
5 | 4 | 168 | 167 | Monty Python and the Holy Grail (1974) | True | False |
5 | 4 | 172 | 171 | Empire Strikes Back, The (1980) | True | False |
5 | 4 | 173 | 172 | Princess Bride, The (1987) | True | False |
5 | 4 | 174 | 173 | Raiders of the Lost Ark (1981) | True | False |
5 | 4 | 204 | 203 | Back to the Future (1985) | True | False |
5 | 4 | 210 | 209 | Indiana Jones and the Last Crusade (1989) | True | False |
5 | 4 | 423 | 422 | E.T. the Extra-Terrestrial (1982) | True | False |