Data Preparation and Import

To analyze data in MIDAS, you need to load a data file. This page explains supported file formats, data types, and measurement scales.

Supported File Formats

MIDAS supports the following text-based data file formats:

CSV (Comma-Separated Values): The most common data format. Columns are separated by commas (,). The file extension is typically .csv.

TSV (Tab-Separated Values): Columns are separated by tab characters. The file extension is typically .tsv or .txt.

Character Encoding: UTF-8 encoding is supported. When saving CSV from Excel, select the "CSV UTF-8 (Comma delimited)" format.
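The only difference between the two formats is the column delimiter. As a minimal sketch in Python (not MIDAS's own loader; the extension-to-delimiter table and the function names are assumptions made for illustration), the same data parses to identical rows either way:

```python
import csv
import io
from pathlib import Path

# Hypothetical mapping from file extension to delimiter (an assumption,
# based on the typical extensions listed above).
DELIMITERS = {".csv": ",", ".tsv": "\t", ".txt": "\t"}

def delimiter_for(filename: str) -> str:
    """Return the column delimiter implied by the file extension."""
    suffix = Path(filename).suffix.lower()
    if suffix not in DELIMITERS:
        raise ValueError(f"unsupported extension: {suffix!r}")
    return DELIMITERS[suffix]

def parse_rows(text: str, filename: str) -> list[list[str]]:
    """Split delimited text into rows of string fields."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter_for(filename))
    return list(reader)

csv_rows = parse_rows("Name,Age\nAlice,25\n", "people.csv")
tsv_rows = parse_rows("Name\tAge\nAlice\t25\n", "people.tsv")
print(csv_rows == tsv_rows)  # the two formats carry the same rows
```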

File Structure

MIDAS assumes data files have the following structure:

  • Row 1: Column names (header row)
  • Row 2 onwards: Data rows

Example:

Name,Age,Country
Alice,25,USA
Bob,30,Japan
Charlie,28,UK
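
To check that a file follows this structure before loading it, the header row and the data rows can be separated with Python's standard csv module. This is only an illustrative sketch, independent of MIDAS; the variable names are arbitrary.

```python
import csv
import io

sample = """Name,Age,Country
Alice,25,USA
Bob,30,Japan
Charlie,28,UK
"""

reader = csv.DictReader(io.StringIO(sample))
columns = reader.fieldnames   # column names, read from row 1
records = list(reader)        # rows 2 onwards, one dict per data row

print(columns)
print(len(records), "data rows")
```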

Data Types

MIDAS automatically determines data types when loading. The following data types are supported:

boolean: Boolean values written as true/false, 1/0, yes/no, and so on.

int64 (integer): Numbers without decimal points (e.g., 1, 42, -10).

float64 (floating point): Numbers with decimal points (e.g., 3.14, 0.5, -2.71).

date: Date data (e.g., 2025-11-17, 2025/11/17).

datetime: Data including both date and time (e.g., 2025-11-17 14:30:00).

timespan: Time-of-day data (e.g., 14:30:00, 09:15).

duration: Duration data (e.g., 1h 30m, 2d 3h).

string: Text data that does not match any of the above types.

Data types are displayed in parentheses in column headers (e.g., Age (int64)). If a data type is not determined correctly, right-click the column in the data table and choose "Convert Column Type" to convert it.
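
The exact detection rules MIDAS uses are not described here. The following Python sketch shows one plausible way to infer a column type from its string values, and may help explain why a column ends up as string: a single non-matching value is enough. The order of checks, the accepted spellings, and the function name infer_type are all assumptions; only a subset of the types above is covered.

```python
import re
from datetime import datetime

BOOL_WORDS = {"true", "false", "yes", "no"}   # assumed spellings
DATE_FORMATS = ("%Y-%m-%d", "%Y/%m/%d")       # the two formats shown above
INT_RE = re.compile(r"[+-]?\d+")
FLOAT_RE = re.compile(r"[+-]?(\d+\.\d*|\.\d+)")

def is_date(value: str) -> bool:
    """True if the value parses with one of the assumed date formats."""
    for fmt in DATE_FORMATS:
        try:
            datetime.strptime(value, fmt)
            return True
        except ValueError:
            pass
    return False

def infer_type(values: list[str]) -> str:
    """Return the narrowest type name that every value in the column matches."""
    if not values:
        return "string"
    if all(v.lower() in BOOL_WORDS for v in values):
        return "boolean"
    if all(INT_RE.fullmatch(v) for v in values):
        return "int64"
    if all(INT_RE.fullmatch(v) or FLOAT_RE.fullmatch(v) for v in values):
        return "float64"
    if all(is_date(v) for v in values):
        return "date"
    return "string"

print(infer_type(["25", "30", "28"]))
print(infer_type(["25", "thirty", "28"]))
```

In this sketch a column mixing integers and decimals is treated as float64, and a column containing even one unparseable value falls back to string.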

Measurement Scales

Columns are automatically assigned a statistical "measurement scale". Measurement scales are determined based on data types, but may need to be changed according to the actual meaning of the data. Measurement scales indicate what kind of statistical processing is appropriate for the data.

Nominal Scale: Data representing categories with no meaningful order.

Examples: Gender (male/female), colors (red/blue/green), country names

Ordinal Scale: Data representing categories with a meaningful order.

Examples: Satisfaction (low/medium/high), grade level (1st/2nd/3rd year), grades (A/B/C/D)

Interval Scale: Equally spaced numeric data where differences between values are meaningful but ratios ("how many times") are not, because the zero point is arbitrary.

Examples: Temperature (Celsius), year (AD)

  • The difference between 20 degrees and 10 degrees is a meaningful 10 degrees
  • However, 20 degrees is not "twice as warm" as 10 degrees
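
A short Python calculation makes the point concrete: converting the same temperatures to Kelvin leaves the difference unchanged but changes the ratio, so the "twice" in Celsius is an artifact of where zero happens to sit. The function name is illustrative only.

```python
import math

def celsius_to_kelvin(c: float) -> float:
    """Shift the zero point from the freezing point of water to absolute zero."""
    return c + 273.15

diff_c = 20.0 - 10.0
diff_k = celsius_to_kelvin(20.0) - celsius_to_kelvin(10.0)
ratio_c = 20.0 / 10.0
ratio_k = celsius_to_kelvin(20.0) / celsius_to_kelvin(10.0)

print(math.isclose(diff_c, diff_k))  # differences agree across units
print(ratio_c, round(ratio_k, 3))    # ratios do not
```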

Ratio Scale: Equally spaced numeric data with a true zero, where both differences and ratios ("how many times") are meaningful.

Examples: Height, weight, price, age

  • The difference between 20kg and 10kg is a meaningful 10kg
  • Furthermore, 20kg is "twice as heavy" as 10kg

Measurement scales affect graph type selection and statistical analysis. You can change measurement scales by right-clicking columns in the data table as needed.
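
As a rough illustration of why the scale matters, the sketch below picks a typical summary statistic for each scale using Python's statistics module. The sample data and the choice of statistic per scale follow common textbook convention; they are not a description of what MIDAS computes.

```python
import statistics

colors = ["red", "blue", "red", "green"]                     # nominal
satisfaction = ["low", "high", "medium", "high", "medium"]   # ordinal
order = ["low", "medium", "high"]                            # assumed category order
temps_c = [10.0, 20.0, 30.0]                                 # interval
weights_kg = [10.0, 20.0, 40.0]                              # ratio

# Nominal: only counting is meaningful, so report the most frequent category.
mode_color = statistics.mode(colors)

# Ordinal: order is meaningful, so the median category can be reported.
ranks = [order.index(s) for s in satisfaction]
median_satisfaction = order[statistics.median_low(ranks)]

# Interval: differences are meaningful, so the arithmetic mean is valid.
mean_temp = statistics.mean(temps_c)

# Ratio: ratios are meaningful too, so the geometric mean is also valid.
geo_mean_weight = statistics.geometric_mean(weights_kg)

print(mode_color, median_satisfaction, mean_temp, round(geo_mean_weight, 2))
```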

Common Issues and Solutions

Character Encoding Issues

If text appears garbled after loading, the file's character encoding may not be UTF-8. Re-save the file from Excel using the "CSV UTF-8 (Comma delimited)" format.
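
If Excel is not available, a file can also be re-encoded with a few lines of Python. This is a sketch under the assumption that you know the file's current encoding; cp1252 is used here only as an example, and the function names are hypothetical.

```python
from pathlib import Path

def to_utf8(data: bytes, source_encoding: str) -> bytes:
    """Decode bytes from the source encoding and re-encode them as UTF-8."""
    return data.decode(source_encoding).encode("utf-8")

def reencode_file(src: str, dst: str, source_encoding: str) -> None:
    """Write a UTF-8 copy of src to dst, leaving src untouched."""
    Path(dst).write_bytes(to_utf8(Path(src).read_bytes(), source_encoding))

# Example with in-memory bytes; cp1252 is an assumed source encoding.
legacy = "Name,Age\nZoë,28\n".encode("cp1252")
converted = to_utf8(legacy, "cp1252")
print(converted.decode("utf-8"))
```

Decoding with the wrong source encoding either raises an error or silently produces wrong characters, so it is worth inspecting the converted file before loading it.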

Dates Not Recognized Correctly

The dates may not be in a common format such as YYYY-MM-DD. Change the date column's format in Excel, or load the column as string and convert it afterward.
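
Another option is to normalize the dates before loading. The Python sketch below rewrites date strings as YYYY-MM-DD; the list of input formats is an assumption and must be adjusted to match your data, since a value like 03/04/2025 is ambiguous between day-first and month-first.

```python
from datetime import datetime

# Assumed input formats, tried in order; edit to match the actual file.
INPUT_FORMATS = ("%d/%m/%Y", "%d.%m.%Y")

def to_iso_date(text: str, formats: tuple[str, ...] = INPUT_FORMATS) -> str:
    """Return the date as YYYY-MM-DD, or raise ValueError if no format matches."""
    cleaned = text.strip()
    for fmt in formats:
        try:
            return datetime.strptime(cleaned, fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date: {text!r}")

print(to_iso_date("17/11/2025"))
print(to_iso_date("17.11.2025"))
```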

Loading Excel Files

MIDAS cannot directly load Excel files (.xlsx). In Excel, use "Save As" and select "CSV UTF-8 (Comma delimited)" format, then load that file.
