This dataset contains vibration data recorded on a rotating drive train. This drive train consists of an electronically commutated DC motor and a shaft driven by it, which passes through a roller bearing. With the help of a 3D-printed holder, unbalances with different weights and different radii were attached to the shaft. Besides the strength of the unbalances, the rotation speed of the motor was also varied. This dataset can be used to develop and test algorithms for the automatic detection of unbalances on drive trains. Datasets for 4 differently sized unbalances and for the unbalance-free case were recorded. The vibration data was recorded at a sampling rate of 4096 values per second. Datasets for development (ID "D[0-4]") as well as for evaluation (ID "E[0-4]") are available for each unbalance strength. The rotation speed was varied between approx. 630 and 2330 RPM in the development datasets and between approx. 1060 and 1900 RPM in the evaluation datasets. For each measurement of
VLUC (Video-Like Urban Computing) is a benchmark for video-like computing on citywide traffic density and crowd prediction. It consists of two new datasets, BousaiTYO and BousaiOSA, and the existing datasets TaxiBJ, BikeNYC I-II, and TaxiNYC.
The dataset contains traffic traces collected from 3 different VR applications. Researchers can use this dataset to replicate the behavior of real VR traffic directly in their studies, e.g., their simulations. Further information can be found in the repository.
Context of the data sets: The Zooniverse platform (www.zooniverse.org) has successfully built a large community of volunteers contributing to citizen science projects. Galaxy Zoo and the Milky Way Project were hosted there.
Recorded with a Husky A200 wheeled UGV, the Vulpi 2021 dataset contains 13 min of Inertial Measurement Unit (IMU), motor current, and wheel odometry data, focusing on agricultural terrains. The dataset includes experiments on concrete, a dirt road, a ploughed terrain and an unploughed terrain that were all recorded on an experimental farm in San Cassiano, Lecce, Italy.
The dataset is a private dataset collected for automatic analysis of psychological distress. It contains self-reported distress labels provided by human volunteers. The dataset consists of 30-min interview recordings of participants.
This file contains the data and code for the publication "The Federal Reserve's Response to the Global Financial Crisis and Its Long-Term Impact: An Interrupted Time-Series Natural Experimental Analysis" by A. C. Kamkoum, 2023.
This dataset provides wireless measurements from two industrial testbeds: iV2V (industrial Vehicle-to-Vehicle) and iV2I+ (industrial Vehicular-to-Infrastructure plus sensor).
The uniD dataset is an innovative collection of naturalistic road user trajectories, captured within the RWTH Aachen University campus using drone technology to address common challenges such as occlusions found in traditional traffic data collection methods. It meticulously documents the movement and classifies each road user by type. Employing cutting-edge computer vision algorithms, the dataset ensures high positional accuracy. Its utility spans various applications, from predicting road user behavior and modeling driver actions to conducting scenario-based safety checks for automated driving systems and facilitating the data-driven design of Highly Automated Driving (HAD) system components.
This dataset contains the ground truth for urban changes that occurred in Mariupol, Ukraine, during the time frame 2017-2020. It is useful for transferring the urban change monitoring network ERCNN-DRS (https://github.com/It4innovations/ERCNN-DRS_urban_change_monitoring) to that region.
Bach chorales is a univariate time series based on chorales, where the task is to learn a generative grammar. The dataset consists of single-line melodies of 100 Bach chorales (originally 4 voices). The melody line can be studied independently of the other voices. The grand challenge is to learn a generative grammar for stylistically valid chorales.
Objective: This study introduces the BlendedICU dataset, a massive dataset of international intensive care data. This dataset aims to facilitate generalizability studies of machine learning models, as well as statistical studies of clinical practices in intensive care units.
The dataset comprises patches of size 512x512 pixels collected from the Sentinel-2 L2A satellite mission. All reported forest fires are located in California. For each area of interest, two images are provided: a pre-fire acquisition and a post-fire acquisition. Each image is composed of 12 different channels, collecting information from the visible spectrum, infrared and ultrablue.
A dataset with $23\,870$ digital trajectories (i.e. time series) of handwritten lower- and uppercase Latin letters and Arabic numbers ($a$-$z$, $A$-$Z$, $0$-$9$), generated by $77$ experts using a Wacom Pen Tablet. An expert is considered a proficient user of the recorded symbols, in this case adult native German speakers.
FHRMA is an open-source project for Fetal Heart Rate Morphological Analysis containing Matlab source code and datasets. As a sub-project, it includes a deep learning method and dataset for automatic identification of the maternal heart rate (MHR) and, more generally, false signals (FSs) on fetal heart rate (FHR) recordings. The challenge concerns particularly the FHR signal recorded with Doppler sensors, on which MHR interference and other FSs are particularly common, but the dataset also includes FHR recorded with scalp-ECG. The training and validation dataset contained 1030 expert-annotated periods (mean duration: 36 min) from 635 recordings. Labels consist of annotating each time sample as either 1: False signal; 0: True signal, or -1: do not know or irrelevant.
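The label encoding described above (1, 0, -1) suggests that samples marked -1 should be excluded before computing any per-period statistic. The following is a minimal sketch of that idea in Python, not FHRMA's own code (the project itself is in Matlab); the function name and the example values are hypothetical and chosen only for illustration.

```python
# Label encoding as described in the FHRMA entry:
#   1 = false signal, 0 = true signal, -1 = do not know or irrelevant.
FALSE_SIGNAL, TRUE_SIGNAL, UNKNOWN = 1, 0, -1


def false_signal_fraction(labels):
    """Share of false-signal samples among samples with a known label.

    Samples labeled -1 are ignored. Returns None when no sample in the
    period has a known label. (Hypothetical helper, not part of FHRMA.)
    """
    known = [lab for lab in labels if lab != UNKNOWN]
    if not known:
        return None
    return sum(1 for lab in known if lab == FALSE_SIGNAL) / len(known)


# Made-up per-sample labels for one annotated period.
example_labels = [0, 0, 1, -1, 1, 0, -1, 0]
fraction = false_signal_fraction(example_labels)
```

Treating -1 as "ignore" rather than as a third class is an assumption here; a training loss could apply the same mask per time sample.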
ForeDeCk is a time series database compiled at the National Technical University of Athens that contains 900,000 continuous time series, built from multiple, diverse and publicly accessible sources. ForeDeCk emphasizes business forecasting applications, including series from relevant domains such as industries, services, tourism, imports & exports, demographics, education, labor & wage, government, households, bonds, stocks, insurances, loans, real estate, transportation, and natural resources & environment.
The medaka (Oryzias latipes) and the zebrafish (Danio rerio) are used as model organisms for a variety of subjects in biomedical research. The presented work aims to study the potential of automated ventricular dimension estimation through heart segmentation in medaka. For details, see the paper and the supplementary materials.
Mudestreda is a Multimodal Device State Recognition Dataset obtained from a real industrial milling device, with time series and image data for classification, regression, anomaly detection, Remaining Useful Life (RUL) estimation, signal drift measurement, zero-shot flank tool wear, and feature engineering purposes.
This study’s sample consists of seven corporations (BlackRock, Google, Meta, JP Morgan, Walgreens, Netflix, and PepsiCo) analyzed across seven quarters beginning in 2021. The data includes the implied volatility level (annualized) for the day before, the day of, and the day following the earnings report. This information was obtained from the Bloomberg Terminal dataset BVOL. The data we read from the terminal is based on Bloomberg’s algorithm for calculating the implied volatility for different strikes. The value is the same for both calls and puts, which makes comparisons and calculations more straightforward. The dataset contains a mixture of high-growth, high-risk technology corporations that saw strong market tailwinds during the previous year and steady, high-dividend-paying equities. For a more comprehensive conclusion, we analyze the implied volatility levels across three expirations to determine the influence of each expiration. The shortest maturity spans from 1 to 4 days, wh
PJM Hourly Energy Consumption Data: PJM Interconnection LLC (PJM) is a regional transmission organization (RTO) in the United States. It is part of the Eastern Interconnection grid, operating an electric transmission system serving all or parts of Delaware, Illinois, Indiana, Kentucky, Maryland, Michigan, New Jersey, North Carolina, Ohio, Pennsylvania, Tennessee, Virginia, West Virginia, and the District of Columbia.
SynD is a synthetic energy dataset with a focus on residential buildings. This dataset is the result of a custom simulation process that relies on power traces of household appliances. The output of the simulations is the power consumption of 21 household appliances as well as the household-wide consumption (i.e. mains). Therefore, SynD can be used for Non-Intrusive Load Monitoring, also referred to as Energy Disaggregation.
Human activity recognition and clinical biomechanics are challenging problems in physical telerehabilitation medicine. However, most publicly available datasets on human body movements cannot be used to study both problems in an out-of-the-lab movement acquisition setting. The objective of the VIDIMU dataset is to pave the way towards affordable patient tracking solutions for remote recognition and kinematic analysis of daily life activities.
X-Wines is a consistent wine dataset containing 100,646 instances and 21 million real evaluations carried out by users. Data were collected on the open Web in 2022 and pre-processed for wider free use. The evaluations are ratings on a 1–5 scale given over a period of 10 years (2012–2021) for wines produced in 62 different countries.
The Tufts fNIRS to Mental Workload (fNIRS2MW) open-access dataset is a new dataset for building machine learning classifiers that can consume a short window (30 seconds) of multivariate fNIRS recordings and predict the mental workload intensity of the user during that window.