
Deep Time Series Forecasting with Flow Forecast Part 1: Avocado Prices | by Isaac Godfried | Nov, 2022


Image from Unsplash

Flow Forecast (FF) is a cutting-edge deep learning framework for time series forecasting built in PyTorch. In this ongoing series we will use FF to perform forecasts (and classification) on real-world time series datasets. In this first example we will use FF to forecast prices from a publicly available avocado dataset located on Kaggle (Open Database). Forecasting the prices of produce can prove valuable for consumers and producers alike, as it can help determine the best time to buy or the expected revenue.
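If you want to follow along, FF can be installed with pip inside a notebook cell. The PyPI package name below is my assumption; check the project's README if the install fails:

# In a Kaggle/Colab notebook cell (package name assumed to be flood-forecast):
!pip install flood-forecast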

Dataset: This dataset contains information on US weekly avocado sales from 2015–2020. Data is separated into different metropolitan areas such as Chicago, Detroit, Milwaukee, and so on. There are roughly 9 columns in total. All things considered, this is a relatively "easy" dataset to forecast on, as there are not many missing values in the columns.

Sample of the Avocado dataset.

We will now try to forecast the average price of avocados based on total_volume and several of the other columns such as 4046 (e.g. the volume of certain bags sold).
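The configuration files below reference a chicago_df DataFrame and a chicago_df.csv file. A minimal data-preparation sketch is shown here; the exact raw column names depend on which version of the Kaggle CSV you download, so treat the file name and renames as assumptions:

import pandas as pd

# Load the raw Kaggle CSV (file name is an assumption; adjust to your download).
df = pd.read_csv("avocado-updated-2020.csv")

# Normalize column names to the snake_case names used in the configs below.
# Depending on the dataset version the city column may be "region" or "geography".
df.columns = [c.strip().lower().replace(" ", "_") for c in df.columns]

# Keep a single metropolitan area and sort chronologically.
chicago_df = df[df["geography"] == "Chicago"].sort_values("date").reset_index(drop=True)
chicago_df.to_csv("chicago_df.csv", index=False)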

It is worth noting that, in reality, for a long-range forecasting problem where we forecast with more than one forward pass through the model (i.e. we concatenate the model's prediction for the target to the other features and re-feed it into the model), we would likely need to treat things like total_volume and 4046 as targets as well, since we would not have access to their real-world ground-truth values several time steps ahead of time. However, to simplify this tutorial we will assume that we do (these values could also come from separate estimates or other models).
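To make the re-feeding idea concrete, here is a minimal conceptual sketch of multi-step decoding. This is only an illustration of the loop, not FF's actual simple_decode implementation, and it assumes a model whose output has shape (batch, seq_len, out):

import torch

def autoregressive_forecast(model, history, future_covariates, steps):
    # history: (1, seq_len, n_features) tensor; column 0 is the target (average_price).
    # future_covariates: (steps, n_features - 1) tensor of feature values we assume
    # are known ahead of time (total_volume, 4046, etc.).
    window = history.clone()
    preds = []
    for i in range(steps):
        # One-step prediction for the next average_price.
        next_price = model(window)[:, -1, :1]
        # Re-attach the known covariates for that step to form a full feature row.
        next_row = torch.cat([next_price, future_covariates[i:i + 1]], dim=-1)
        # Slide the window: drop the oldest step, append the newly built one.
        window = torch.cat([window[:, 1:], next_row.unsqueeze(1)], dim=1)
        preds.append(next_price)
    return torch.cat(preds, dim=0)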

Approach 1: The first approach that we will try is DA-RNN, which is an older though still effective deep learning model for time series forecasting. To do this we will first design a configuration file which contains our model's parameters:

the_config = {
    "model_name": "DARNN",
    "model_type": "PyTorch",
    "model_params": {
        "n_time_series": 6,
        "hidden_size_encoder": 128,
        "decoder_hidden_size": 128,
        "out_feats": 1,
        "forecast_history": 5,
        "gru_lstm": False
    },
    "dataset_params": {
        "class": "default",
        "training_path": "chicago_df.csv",
        "validation_path": "chicago_df.csv",
        "test_path": "chicago_df.csv",
        "forecast_length": 1,
        "batch_size": 4,
        "forecast_history": 4,
        "train_end": int(len(chicago_df) * 0.7),
        "valid_start": int(len(chicago_df) * 0.7),
        "valid_end": int(len(chicago_df) * 0.9),
        "test_start": int(len(chicago_df) * 0.9),
        "target_col": ["average_price"],
        "sort_column": "date",
        "no_scale": True,
        "relevant_cols": ["average_price", "total_volume", "4046", "4225", "4770"],
        "scaler": "StandardScaler",
        "interpolate": False,
        "feature_param": {
            "datetime_params": {
                "month": "numerical"
            }
        }
    },
    "training_params": {
        "criterion": "DilateLoss",
        "optimizer": "Adam",
        "optim_params": {"lr": 0.001},
        "epochs": 4,
        "batch_size": 4
    },
    "inference_params": {
        "datetime_start": "2020-11-01",
        "hours_to_forecast": 5,
        "test_csv_path": "chicago_df.csv",
        "decoder_params": {
            "decoder_function": "simple_decode",
            "unsqueeze_dim": 1
        }
    },
    "GCS": False,
    "wandb": {
        "name": "avocado_training",
        "tags": ["DA-RNN", "avocado_forecast", "forecasting"],
        "project": "avocado_flow_forecast"
    },
    "forward_params": {},
    "metrics": ["DilateLoss", "MSE", "L1"]
}

In this case we will use the DilateLoss function. DilateLoss, proposed back in 2020, is a loss function that penalizes errors in both the values and the shape of the time series. It is a great function for training, but unfortunately it does not work with every model. We will also add month as a feature in our configuration file.
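Since DilateLoss is not supported by every model, the criterion is easy to swap. For example, falling back to plain MSE (as we do in Approach 3) is just a change to the same config dict:

# Fall back to an MSE criterion if DilateLoss is unavailable for a given model.
the_config["training_params"]["criterion"] = "MSE"
the_config["metrics"] = ["MSE", "L1"]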

Now we will train the model for several epochs using the train function:

import os

from flood_forecast.trainer import train_function
from kaggle_secrets import UserSecretsClient

# Pull the Weights and Biases API key from Kaggle secrets and train the model.
user_secrets = UserSecretsClient()
secret_value_0 = user_secrets.get_secret("WANDB_KEY")
os.environ["WANDB_API_KEY"] = secret_value_0
trained_model = train_function("PyTorch", the_config)
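The kaggle_secrets import only works inside a Kaggle notebook. If you are running locally, a minimal alternative is to set the key directly (placeholder value shown; everything else stays the same):

import os
from flood_forecast.trainer import train_function

# Outside of Kaggle, supply your own W&B API key directly.
os.environ["WANDB_API_KEY"] = "<your-wandb-api-key>"
trained_model = train_function("PyTorch", the_config)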

Now let's analyze some of the results on Weights and Biases:

We can see that the model seemed to converge fairly well. We probably could have squeezed out some more performance by training for another epoch or two, based on the validation loss.

The predictions are shown in red here and the actual average_price is shown in blue.

Additionally, the prediction is not terrible, particularly given that we did not extensively tune the parameters. On the flip side, however, we can see the model is not really using total_volume when predicting the price (at least according to SHAP).

You can see the full code for this tutorial here and the W&B log here.

Approach 2: We will use a probabilistic version of a GRU to predict the price of avocados in different regions around the US.

The advantage of a probabilistic model is that it predicts the upper and lower bounds of the forecasted value. Again we define a configuration file:

the_config = {
    "model_name": "VanillaGRU",
    "model_type": "PyTorch",
    "model_params": {
        "n_time_series": 6,
        "hidden_dim": 32,
        "probabilistic": True,
        "num_layers": 1,
        "forecast_length": 2,
        "n_target": 2,
        "dropout": 0.15
    },
    "probabilistic": True,
    "dataset_params": {
        "class": "default",
        "training_path": "chicago_df.csv",
        "validation_path": "chicago_df.csv",
        "test_path": "chicago_df.csv",
        "forecast_length": 2,
        "forecast_history": 5,
        "train_end": int(len(chicago_df) * 0.7),
        "valid_start": int(len(chicago_df) * 0.7),
        "valid_end": int(len(chicago_df) * 0.9),
        "test_start": int(len(chicago_df) * 0.9),
        "target_col": ["average_price"],
        "sort_column": "date",
        "no_scale": True,
        "relevant_cols": ["average_price", "total_volume", "4046", "4225", "4770"],
        "scaler": "StandardScaler",
        "interpolate": False,
        "feature_param": {
            "datetime_params": {
                "month": "numerical"
            }
        }
    },
    "training_params": {
        "criterion": "NegativeLogLikelihood",
        "optimizer": "Adam",
        "optim_params": {"lr": 0.001},
        "epochs": 5,
        "batch_size": 4
    },
    "inference_params": {
        "probabilistic": True,
        "datetime_start": "2020-11-01",
        "hours_to_forecast": 5,
        "test_csv_path": "chicago_df.csv",
        "decoder_params": {
            "decoder_function": "simple_decode",
            "unsqueeze_dim": 1,
            "probabilistic": True
        }
    },
    "GCS": False,
    "wandb": {
        "name": "avocado_training",
        "tags": ["GRU_PROB", "avocado_forecast", "forecasting"],
        "project": "avocado_flow_forecast"
    },
    "forward_params": {},
    "metrics": ["NegativeLogLikelihood"]
}

Here we will use the NegativeLogLikelihood loss for our loss function. This is a special loss function for probabilistic models: instead of a single point estimate, the model outputs distribution parameters (here a mean and a standard deviation), and the loss is the negative log likelihood of the observed price under that distribution.
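A minimal sketch of the idea using torch.distributions — this only illustrates the loss and how the plotted bounds arise, it is not FF's internal implementation:

import torch
from torch.distributions import Normal

# Suppose the network outputs a mean and a (positive) standard deviation per step.
mu = torch.tensor([1.42])        # predicted mean average_price
sigma = torch.tensor([0.10])     # predicted standard deviation
observed = torch.tensor([1.35])  # actual average_price

dist = Normal(mu, sigma)
nll = -dist.log_prob(observed).mean()   # the training loss

# An approximate 95% prediction interval gives the upper/lower bounds plotted below.
lower, upper = mu - 1.96 * sigma, mu + 1.96 * sigma

Now, as with the previous model, we can examine the results on Weights and Biases.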

Again the training seems to go mostly well.
Here the model produces an upper bound, a lower bound, and a predicted mean. We can see the model is fairly good at predicting the mean but still fairly uncertain about its upper and lower bounds (even producing a negative lower bound).

You can see the full code in this tutorial notebook.

Approach 3: We can now try using a single neural network to predict multiple geographic regions at once. To do this we will use a simple transformer model. The configuration below references a merged_df (saved as multi_city.csv) that combines the Chicago and Detroit series, with _ch and _dt column suffixes.
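A minimal sketch of how such a frame might be built, assuming a detroit_df prepared the same way as chicago_df above (the frame construction here is an assumption, not the article's exact code):

# Merge the two city frames on date, suffixing the overlapping columns.
cols = ["date", "average_price", "total_volume", "4046", "4225", "4770"]
merged_df = chicago_df[cols].merge(
    detroit_df[cols], on="date", suffixes=("_ch", "_dt")
)
merged_df.to_csv("multi_city.csv", index=False)

Just like with the last two models, we then define a configuration file: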

the_config = {
    "model_name": "CustomTransformerDecoder",
    "model_type": "PyTorch",
    "model_params": {
        "n_time_series": 11,
        "seq_length": 5,
        "dropout": 0.1,
        "output_seq_length": 2,
        "n_layers_encoder": 2,
        "output_dim": 2,
        "final_act": "Swish"
    },
    "n_targets": 2,
    "dataset_params": {
        "class": "default",
        "training_path": "multi_city.csv",
        "validation_path": "multi_city.csv",
        "test_path": "multi_city.csv",
        "sort_column": "date",
        "batch_size": 10,
        "forecast_history": 5,
        "forecast_length": 2,
        "train_end": int(len(merged_df) * 0.7),
        "valid_start": int(len(merged_df) * 0.7),
        "valid_end": int(len(merged_df) * 0.9),
        "test_start": int(len(merged_df) * 0.9),
        "test_end": int(len(merged_df)),
        "target_col": ["average_price_ch", "average_price_dt"],
        "relevant_cols": ["average_price_ch", "average_price_dt", "total_volume_ch", "4046_ch", "4225_ch", "4770_ch", "total_volume_dt", "4046_dt", "4225_dt", "4770_dt"],
        "scaler": "MinMaxScaler",
        "no_scale": True,
        "scaler_params": {
            "feature_range": [0, 2]
        },
        "interpolate": False,
        "feature_param": {
            "datetime_params": {
                "month": "numerical"
            }
        }
    },
    "training_params": {
        "criterion": "MSE",
        "optimizer": "Adam",
        "optim_params": {
            "lr": 0.001
        },
        "epochs": 5,
        "batch_size": 5
    },
    "GCS": False,
    "wandb": {
        "name": "avocado_training",
        "tags": ["multi_trans", "avocado_forecast", "forecasting"],
        "project": "avocado_flow_forecast"
    },
    "forward_params": {},
    "metrics": ["MSE"],
    "inference_params": {
        "datetime_start": "2020-11-08",
        "num_prediction_samples": 20,
        "hours_to_forecast": 5,
        "test_csv_path": "multi_city.csv",
        "decoder_params": {
            "decoder_function": "simple_decode",
            "unsqueeze_dim": 1
        }
    }
}

For this model we will return to using MSE for our loss function. We can now analyze the results from W&B.

The model seems to converge well (though it may not have enough data, as transformers generally require large amounts).
The green shaded area is the confidence interval.

The Chicago forecast looks a bit off; however, with some more hyper-parameter tuning it could probably perform well (in particular, more dropout).
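As a hedged illustration of the kind of tweak one might try next (whether it actually helps would need to be checked against the W&B metrics):

# One possible tuning direction: regularize more aggressively and add depth.
the_config["model_params"]["dropout"] = 0.3
the_config["model_params"]["n_layers_encoder"] = 3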

Full code

Weights and Biases log

Conclusion

Here we saw the results of three different models with respect to forecasting the price of avocados over a five-week period. FF makes it easy to train many different types of models for forecasting and to see which performs best. Part two of this series will cover forecasting grocery sales.
