Datasets
Properly preparing your data is essential to ensure accurate and reliable results. This includes understanding the content, structure, and expected format of your dataset. A few restrictions apply to guarantee that Studio interprets your data correctly.
Studio is designed to work with time-series data, so each dataset is treated as a time series by default. Timestamps must be unique within a dataset; if duplicates exist, Studio will use the first occurrence along with its corresponding observations and ignore any subsequent duplicates.
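The "first occurrence wins" rule can be reproduced before upload to see which rows would be ignored. The following is a minimal sketch, not Studio's implementation; the function name and the sample rows are hypothetical, and rows are assumed to be dictionaries keyed by column name.

```python
def keep_first_per_timestamp(rows):
    """Drop rows whose Timestamp was already seen, keeping the first occurrence."""
    seen = set()
    unique_rows = []
    for row in rows:
        timestamp = row["Timestamp"]
        if timestamp in seen:
            continue  # a later duplicate: ignored, as described above
        seen.add(timestamp)
        unique_rows.append(row)
    return unique_rows


# Hypothetical sample data: the second and third rows share a timestamp.
rows = [
    {"Timestamp": "2024-01-01 00:00:00 UTC", "value": 1.0},
    {"Timestamp": "2024-01-01 01:00:00 UTC", "value": 2.0},
    {"Timestamp": "2024-01-01 01:00:00 UTC", "value": 9.9},
]
deduped = keep_first_per_timestamp(rows)
```

Here `deduped` holds two rows, and the 01:00 entry keeps the value 2.0 from its first occurrence.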
Sampling Rate
The sampling rate is the number of samples recorded in a time series per unit of time, taken at a constant interval. The sampling period is therefore the time difference between two consecutive samples or observations in the time series.
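As an illustration of these two definitions, the sketch below derives the sampling period and the sampling rate from a list of timestamps. The helper name and the 15-minute sample data are assumptions made for the example.

```python
from datetime import datetime


def sampling_periods(timestamps):
    """Return the time difference between each pair of consecutive samples."""
    return [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]


# Hypothetical series sampled every 15 minutes.
timestamps = [
    datetime(2024, 1, 1, 0, 0),
    datetime(2024, 1, 1, 0, 15),
    datetime(2024, 1, 1, 0, 30),
    datetime(2024, 1, 1, 0, 45),
]
periods = sampling_periods(timestamps)

# With a constant interval, every period is identical.
is_constant = len(set(periods)) == 1
period_seconds = periods[0].total_seconds()  # 900.0 seconds
samples_per_hour = 3600 / period_seconds     # 4.0 samples per hour
```

A period of 900 seconds corresponds to a rate of four samples per hour; the two quantities are reciprocals of each other.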
Missing Data
The term missing data refers to gaps or absent values within a dataset, where there is no valid measurement or recorded entry for a given variable. This typically occurs due to errors, corruption, or incompatibility.
In practice, this appears as:
- Empty cells ("")
- Strings such as na, nan, n/a, missing, or null
- Non-numeric values in numeric columns
- Values that are infinite or cannot be parsed
Missing values are automatically detected and handled within Studio, so your models can still be built and your analysis remains robust even when the data is incomplete.
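To check a numeric column for these cases before uploading, something like the following can be used. This is a sketch under stated assumptions rather than Studio's actual detection logic: the function name is hypothetical, and matching tokens case-insensitively after trimming whitespace is an assumption.

```python
import math

# Tokens from the list above; case-insensitive matching is an assumption.
MISSING_TOKENS = {"", "na", "nan", "n/a", "missing", "null"}


def is_missing_numeric(cell):
    """Return True if a cell in a numeric column would count as missing."""
    text = cell.strip().lower()
    if text in MISSING_TOKENS:
        return True
    try:
        value = float(text)
    except ValueError:
        return True  # non-numeric value in a numeric column
    return not math.isfinite(value)  # infinite values also count as missing


cells = ["3.14", "", "N/A", "abc", "inf", "-2"]
flags = [is_missing_numeric(cell) for cell in cells]
```

In this example only "3.14" and "-2" are treated as valid measurements.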
Dataset Size
You can upload a file with up to 40,000 rows of data. Keeping your file under this limit ensures smooth processing and fast validation.
For example, three years of hourly time-series data provides an ideal balance between comprehensive seasonal coverage and practicality. The long timeframe ensures that each seasonal pattern repeats often enough to be observed: a daily cycle repeats over 1,000 times, whilst a weekly pattern occurs 156 times. Three annual cycles also provide annual seasonality, which is key to robust trend detection and seasonal decomposition. Capturing three years of data helps Studio distinguish true seasonal effects from random fluctuations, allowing year-on-year trends to be identified with greater statistical confidence. The image below illustrates an example of three years of hourly time-series data.
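The figures quoted above can be checked with simple arithmetic. The short calculation below assumes 365-day years (no leap day) and one row per hour.

```python
HOURS_PER_DAY = 24
DAYS_PER_YEAR = 365  # assumption: leap days ignored
YEARS = 3
ROW_LIMIT = 40_000   # upload limit stated above

daily_cycles = YEARS * DAYS_PER_YEAR       # 1095 daily cycles
weekly_cycles = daily_cycles // 7          # 156 complete weeks
total_rows = daily_cycles * HOURS_PER_DAY  # 26280 hourly rows
fits_in_limit = total_rows <= ROW_LIMIT    # True
```

Three years of hourly data therefore comes to 26,280 rows, comfortably inside the 40,000-row limit.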

Expected Dataset Structure
The table below showcases the expected columns
Data Types
At its core, Studio supports two key data types: numerical and categorical. Numerical data consists of values that can be measured and represented using numbers, such as age, temperature, or height. Categorical data, by contrast, consists of values that represent categories or groups.
Within the world of time-series data there are numerous types of variables. Numerical variables are quantifiable: they can be measured and represented by numbers. This includes continuous variables, such as pressure, as well as discrete variables, such as the number of items sold.
Studio also supports Boolean variables, a special subtype of categorical data in which there are only two possible values (0/1 or True/False). Other categorical variables, such as nominal, ordinal, and multi-class, are also supported.
Adding a Dataset
Adding a new dataset in Studio is simple. From the datasets page, first click the Add Dataset button. This will open the Add Dataset popup.
There are two steps to adding a dataset:
- Upload File: upload a file containing time-series data.
- Create Dataset: once the file is validated, configure and personalise the uploaded data.
Upload File
Your file must meet a few requirements before it can be used to create a dataset in Studio. Studio only supports CSV files, and the file must contain a column named Timestamp whose values use the format YYYY-mm-dd HH:MM:SS UTC.
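To test whether a timestamp string matches this format, it can be parsed with Python's standard library. The sketch below assumes that "UTC" appears as a literal suffix after the time (for example 2024-03-15 13:45:00 UTC); the function name is hypothetical.

```python
from datetime import datetime, timezone

# Assumption: "UTC" is a literal suffix in each timestamp value.
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S UTC"


def parse_timestamp(text):
    """Parse a 'YYYY-mm-dd HH:MM:SS UTC' string into a timezone-aware datetime.

    Raises ValueError if the text does not match the format.
    """
    naive = datetime.strptime(text, TIMESTAMP_FORMAT)
    return naive.replace(tzinfo=timezone.utc)


parsed = parse_timestamp("2024-03-15 13:45:00 UTC")
```

A string in another layout, such as 15/03/2024 13:45, raises a `ValueError`, which is a quick way to spot rows that need reformatting.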

Once you have selected and uploaded a file, it is validated. The validation process checks that uploaded files are correctly formatted and ready to be used for forecasting. When a file is submitted, it is automatically reviewed to confirm that required columns are present, timestamps can be read correctly, and the data is complete and consistent. The system also verifies that values are numeric where expected and that the time intervals make sense for analysis. If any issues are found, the validation will highlight them so they can be fixed before continuing. Once the dataset passes these checks, the pop-up automatically takes you to the second step of the process, Create Dataset.
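A few of these checks can be approximated locally before uploading. The following is a rough pre-flight sketch, not Studio's validator: the function name, the issue messages, and the choice of checks (required column, row limit, readable timestamps) are assumptions for illustration.

```python
import csv
import io
from datetime import datetime

TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S UTC"  # assumption: literal "UTC" suffix


def preflight_check(csv_text, row_limit=40_000):
    """Return a list of human-readable issues found in the CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    if "Timestamp" not in (reader.fieldnames or []):
        return ["missing required column: Timestamp"]

    issues = []
    rows = list(reader)
    if len(rows) > row_limit:
        issues.append(f"{len(rows)} rows exceeds the limit of {row_limit}")

    for index, row in enumerate(rows, start=1):
        try:
            datetime.strptime(row["Timestamp"], TIMESTAMP_FORMAT)
        except (ValueError, TypeError):
            issues.append(f"data row {index}: unreadable timestamp")
    return issues


# Hypothetical file contents with one malformed timestamp.
sample = (
    "Timestamp,load\n"
    "2024-01-01 00:00:00 UTC,1.5\n"
    "01/01/2024 01:00,1.7\n"
)
found = preflight_check(sample)
```

An empty list means none of these particular checks failed; Studio's own validation may still report other issues.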
Create Dataset
Once your uploaded file has been successfully validated, you can name your dataset and add a brief description. By default, Studio uses the name of the uploaded file as the dataset name.

Updating a Dataset
Any dataset that has previously been uploaded into Studio can be updated with new values. This may occur when new data becomes available or when existing values need to be corrected. Updates are always based on the Timestamp column, ensuring that new or modified records are correctly aligned with your existing dataset.
How to Update a Dataset
Update Dataset
In the Datasets table, click the update icon for the dataset you wish to modify. This opens the Update Dataset popup.
Upload New Dataset
Updating a dataset follows a process similar to adding a new dataset:
- Upload a supported file containing the updated data.
- The system validates the file to ensure it matches the existing dataset structure.
- Once validated, the new data can be added to the dataset.
When adding new data, only rows with timestamps after the end of the previously uploaded dataset are appended. This means the uploaded file can have a different filename, or can start immediately after the last timestamp in the existing dataset, and it will still be integrated successfully. The illustration below demonstrates how new data is merged with an existing dataset:
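The append rule can be sketched in a few lines. This is an illustration of the behaviour described above, not Studio's merge code: the function name is hypothetical, rows are assumed to be (timestamp, value) tuples, and the existing dataset is assumed to be sorted by timestamp. It covers appending only, not correcting existing values.

```python
from datetime import datetime


def append_new_rows(existing, update):
    """Append only the update rows whose timestamp is after the existing end."""
    by_time = lambda row: row[0]
    if not existing:
        return sorted(update, key=by_time)
    last_timestamp = existing[-1][0]
    newer = [row for row in update if row[0] > last_timestamp]
    return existing + sorted(newer, key=by_time)


# Hypothetical data: the update overlaps the existing dataset at 01:00.
existing = [
    (datetime(2024, 1, 1, 0), 1.0),
    (datetime(2024, 1, 1, 1), 2.0),
]
update = [
    (datetime(2024, 1, 1, 1), 99.0),  # not after the existing end: skipped
    (datetime(2024, 1, 1, 2), 3.0),   # after the existing end: appended
]
merged = append_new_rows(existing, update)
```

The merged result has three rows, and the 01:00 row keeps its original value of 2.0.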
Deleting a Dataset
Deleting a dataset is easy within Studio. Each row of the Datasets table has a column with a delete button. When it is clicked, a pop-up appears telling you which dataset will be deleted. Here you can either confirm the deletion or cancel it.
