Leveling up Coaching: NVTabular and PyTorch Lightning | by Dylan Valerio | Jun, 2022

June 19, 2022

1

Coaching a large and deep recommender mannequin on MovieLens 25M

NVTabular is a characteristic engineering framework designed to work with NVIDIA Merlin. It may possibly course of giant datasets typical in manufacturing recommender setups. I attempted to work with NVIDIA Merlin on free cases, however the really useful strategy appears to be the one approach ahead. However I nonetheless needed to make use of NVTabular because the worth of utilizing the GPU for information engineering and information loading may be very engaging. On this submit, I’m going to make use of NVTabular with PyTorch Lightning to coach a large and deep recommender mannequin on MovieLens 25M.

We’ve a hybrid implementation right here. Picture by Laura Musikanski on Pexels.com.

For the code, chances are you’ll test my Kaggle pocket book. We’ve plenty of elements for this implementation as listed under:

Massive chunks of the code are lifted from the NVTabular tutorial.
For the mannequin, I’m utilizing some elements of James Le’s work.
Optuna is used for hyperparameter tuning.
Coaching is completed through PyTorch Lightning.
I’m additionally leveraging CometML for metric monitoring.
The dataset I used is MovieLens 25M. It has 25 million scores and a million tag purposes utilized to 62,000 films by 162,000 customers [1].

Cool? Let’s begin!

There are a number of benefits to utilizing NVTabular. You should utilize datasets which can be bigger than reminiscence (it makes use of dask), and all processing could be accomplished within the GPU. Additionally, the framework makes use of DAGs, that are conceptually acquainted to most engineers. Our operations can be outlined utilizing these DAGs.

We’ll first outline our workflow. First, we’re going to make use of implicit scores the place 1 is a ranking of 4 and 5. Second, we’ll be changing the genres column right into a multi-hot categorical characteristic. Third, we’ll be becoming a member of the scores and the genres tables. Be aware that the >> is overloaded and behaves similar to a pipe. For those who run this cell, a DAG will seem.

Previous article5 Expertise a Cloud Architect Must Succeed

Leveling up Coaching: NVTabular and PyTorch Lightning | by Dylan Valerio | Jun, 2022

Coaching a large and deep recommender mannequin on MovieLens 25M

The subsequent steps may very well be:

References

Local weather tech startup Ecozen raises INR 54 Cr in Sequence C

Picture Recognition Algorithm Utilizing Switch Studying | by Riccardo Andreoni | Jun, 2022

TeamViewer appoints Rupesh Lunkad as India MD

LEAVE A REPLY Cancel reply

Most Popular

5 Expertise a Cloud Architect Must Succeed

The 8 biggest first-person shooters I’ve performed all through my gaming profession

AMD Ryzen 7 5700X Assessment: A Worth Lower Disguised as a New Chip

unable to organize context: path “…” not discovered | by Teri Radichel | Bugs That Chew | Jun, 2022

Recent Comments

ABOUT US

POPULAR POSTS

5 Expertise a Cloud Architect Must Succeed

The 8 biggest first-person shooters I’ve performed all through my gaming profession

AMD Ryzen 7 5700X Assessment: A Worth Lower Disguised as a New Chip

POPULAR CATEGORY