Zoltar User Guide

    This is draft documentation of how to use Zoltar.

    Introduction

    The Zoltar forecast archive is a web application to develop ideas for a repository of model forecast results. Until now, predictions made by models have been stored in differing formats and locations. This complicates tracking, comparing, and revisiting forecasts. Zoltar supports storing, retrieving, comparing, and analyzing time series forecasts for prediction challenges of interest to the modeling community.

    This document is a first draft of a user guide covering Zoltar's main features.

    Note: The site is currently in beta testing. If you have questions about it or want an account, please contact Professor Nicholas Reich (nick@schoolph.umass.edu), director of the Reich Lab.

    Zoltar Home page

    The home page lists all of the projects in the archive. A project is a collection of forecast models and their forecasts, and is described in more detail below. This page shows basic project information, including name, owner, description, and (in the "Objects" column) a summary of the number of models and forecasts in the project. (This number is an estimate.)

    At the top of the page is a header that is shown on all pages. It contains three items: a home page icon (the Zoltar crystal ball) in the upper left, a user drop down menu on the right, and a help (question mark) icon on the far right. The drop down menu's appearance depends on whether a user is logged in. If so, its text is the user name and its items are a link to the user profile page and a logout item. If not, "Sign in" is shown; clicking it takes you to a typical login page where you enter your account's user name and password.

    Clicking a project name takes you to its detail page.

    Creating projects: Any logged in user can create projects via the "New" button towards the top. Clicking it will take you to a form where you can fill in the details described below.

    Project detail page

    A project is the main element for representing a forecasting challenge. It has an owner (a registered user in the system), and zero or more model owners (also users). A project owner can do anything to the project, including what model owners can do, but model owners are limited to creating, editing, and deleting Models, and uploading and deleting forecasts. (To become a model owner you must contact the project owner.)
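
    The ownership rules above can be sketched as a simple permission check. This is only an illustration of the rules as stated, not Zoltar's code: the `Project` class, the action names, and the `is_allowed` helper are all hypothetical, and the sketch ignores any per-model ownership details.

```python
from dataclasses import dataclass, field
from typing import Set

# Actions the text lists for model owners (the names themselves are hypothetical).
MODEL_OWNER_ACTIONS = {
    "create_model", "edit_model", "delete_model",
    "upload_forecast", "delete_forecast",
}


@dataclass
class Project:
    """Hypothetical stand-in: one owner plus zero or more model owners."""
    owner: str
    model_owners: Set[str] = field(default_factory=set)


def is_allowed(user: str, action: str, project: Project) -> bool:
    """Project owners may do anything; model owners only the listed actions."""
    if user == project.owner:
        return True
    return user in project.model_owners and action in MODEL_OWNER_ACTIONS


project = Project(owner="alice", model_owners={"bob"})
```

    In this sketch, "alice" is allowed every action, "bob" may upload a forecast but not delete the project, and any other user is refused.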

    Each project's detail page is divided into five vertical sections, described next: details table, forecast models, locations, targets, and time zeros.

    Project details table

    At the top of the page is a table showing non-model information related to the project:

    Forecast models

    The Models section lists the project's forecast models by name, with links to model detail pages (see details below). If you're a project owner or model owner then a "New" button is shown that takes you to a form for creating a model.

    Locations

    This section lists the names of the locations in the project. These were either defined by the project owner when creating the project, or were created automatically when the project template was loaded and a referenced location was not found.

    Targets

    This section lists information about project forecast targets. Like locations, these were either created explicitly by the project owner, or automatically when the template was loaded. The information includes the following fields.

    (A note regarding auto-created targets. Auto-created targets require further, careful editing to be complete. In particular, users must fill in the description and unit fields, identify the date-related ones by checking Is date, and check Is step ahead and fill in the Step ahead increment integer value. Importantly: You will get unwanted results if Is date or Is step ahead is incorrect.)

    Time zeros

    This section details the project's time zeros. About time zeros and data version dates: Because the forecasting field does not have standard terminology, we have settled on the following two concepts for this application. Note that some time zeros are tagged as starting a season, specifying the season's name, which helps to segment the time zeros. Zoltar uses season information in the visualization and score pages, where the user can select which season to show data for. This also helps to keep performance up.
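
    The way season-start tags segment the time zeros can be sketched as follows. This is a minimal illustration rather than Zoltar's implementation: the `TimeZero` class, its field names, and the sample dates are hypothetical, and the sketch assumes each time zero belongs to the most recent season start at or before it.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Optional


@dataclass
class TimeZero:
    """Hypothetical stand-in for a project time zero."""
    timezero_date: date
    season_name: Optional[str] = None  # set only on time zeros that start a season


def group_by_season(time_zeros: List[TimeZero]) -> Dict[str, List[date]]:
    """Assign each time zero to the most recent season start at or before it."""
    seasons: Dict[str, List[date]] = {}
    current: Optional[str] = None
    for tz in sorted(time_zeros, key=lambda t: t.timezero_date):
        if tz.season_name is not None:
            current = tz.season_name
            seasons.setdefault(current, [])
        if current is not None:  # time zeros before the first season start are skipped
            seasons[current].append(tz.timezero_date)
    return seasons


# Made-up time zeros: two season starts, each followed by one more time zero.
example = [
    TimeZero(date(2016, 10, 3), season_name="2016-2017"),
    TimeZero(date(2016, 10, 10)),
    TimeZero(date(2017, 10, 2), season_name="2017-2018"),
    TimeZero(date(2017, 10, 9)),
]
seasons = group_by_season(example)
```

    Selecting a season then amounts to looking up one key, which is why restricting a page to a single season reduces the amount of data it has to load.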

    Model detail page

    A model is the representation of code that generates forecasts. Clicking on a model link takes you to its detail page. The detail page is divided into two vertical sections, described next: details table and forecasts list.

    Model details table

    At the top of the page is a table showing information related to the model:

    - Owner: The model's owner. The owner is the user that created the model (which is done on the home page), and she can edit or delete the model, and upload or delete its forecasts.
    - Project: A link to the project the model belongs to.
    - Description: Prose provided by the model owner. It should include information on reproducing the model's results.
    - Home: A link to the model home page.
    - Auxiliary data: An optional link to model-specific data files that were used by the model beyond the project's core data. This is not used directly by Zoltar.

    Forecasts

    The Forecasts section lists the model's forecasts, with links to forecast detail pages (see details below). Each forecast is data associated with a particular time zero in the project. The data is loaded from a file with a specific format (described next). There is one line per time zero, specifying the time zero date (but not the data version date), the original forecast data filename that was uploaded, and an Action button: a green upload button if there is no data associated with the time zero, or a red delete button otherwise. Clicking on the filename link takes you to a forecast detail page that includes a data preview plus a Download button (currently CSV or JSON formats).

    About forecast file formats: We have adopted the data format used for CDC flu challenges. Details are at flu_challenge_2016-17_update.docx. Summary:

    Data analysis

    Zoltar provides two fledgling project analysis tools: visualizations and scores. You will find links to them on project detail pages on the Analysis line in the project details section at the top of the page.

    Visualizations

    This page uses the D3 Foresight component's TimeChart to display the project's models' forecasts for each step ahead target. Currently we only use the basic plot feature along with the actual component. We plan to add baseline, history, and other features. To use the chart:

    - Select a location and season from the drop downs. (Note that selecting a season will reload the page, which can take a long time for large projects.)
    - Click the plot area to move the current time zero, indicated by the vertical division between gray on the left and white on the right. The plot shows the step ahead target predictions for each model.
    - Hover over the plot to see details.
    - The legend shows model names, and allows showing or hiding them. Hover over a model name for details, and click the link button to show model details in a separate Zoltar page.
    - Click the left and right arrows to the right of the legend to move the current time zero.
    - Click the 'hamburger' icon to toggle the legend.

    Scores

    This page shows the mean absolute error values for all models in the project. (This can take a long time to load for large projects.) The location and season drop downs work like those on the visualization page. The smallest errors are shown in inverse text.
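
    For reference, mean absolute error is the average of the absolute differences between predicted and observed values. The sketch below shows the calculation for one model. The function name and the sample numbers are illustrative assumptions, and Zoltar's own scoring code may handle details such as missing values differently.

```python
from typing import Sequence


def mean_absolute_error(predicted: Sequence[float], observed: Sequence[float]) -> float:
    """Average of |predicted - observed| over paired values."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if len(predicted) == 0:
        raise ValueError("at least one pair of values is required")
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(predicted)


# Made-up point predictions for one target, and the values later observed.
predicted = [1.5, 2.0, 3.0]
observed = [1.0, 2.5, 2.0]
mae = mean_absolute_error(predicted, observed)  # (0.5 + 0.5 + 1.0) / 3
```

    A smaller value means the model's point predictions were closer to what was observed.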

    For technical users

    API

    We provide the following REST endpoints. All results are JSON. The API is browsable from the root URI on the home page (look for API buttons), and is a great starting point for developers. Note that all projects and users are listed, but private projects, their models, and their forecasts, can only be accessed by authorized accounts.

    Endpoints:

    Forecast data format

    Forecast data can be downloaded in either CSV or JSON format. The CSV matches the input file CDC format, but the JSON is our own and is structured like this:

    {'metadata': project_or_forecast_details_object, 'data': data_object}

    Where project_or_forecast_details_object is an object that's the same as either the Project or Forecast detail object. data_object is structured like this:

    {location1: target_dict_1, location2: target_dict_2, ...}

    Each target_dict is of the form:

    {target1: {'unit': unit1, 'point': point_val1, 'bins': bin_list1},
     target2: {'unit': unit2, 'point': point_val2, 'bins': bin_list2},
     ...
    }

    and each bin_list is like:

    [[bin_start_incl1, bin_end_notincl1, value1],
     [bin_start_incl2, bin_end_notincl2, value2],
     ...
    ]
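
    As an illustration of the structure above, the following sketch walks a downloaded forecast and collects, for each location and target, the point value and the bin with the largest value. The sample data, the location and target names, the empty metadata placeholder, and the `summarize` helper are made up for the example. Only the key names ('metadata', 'data', 'unit', 'point', 'bins') and the bin layout come from the format described here.

```python
import json

# Made-up sample in the documented shape (valid JSON text uses double quotes).
# The metadata object is left empty here as a placeholder.
sample_text = """
{"metadata": {},
 "data": {"Location A": {"Target 1": {"unit": "percent",
                                      "point": 2.4,
                                      "bins": [[0.0, 1.0, 0.1],
                                               [1.0, 2.0, 0.3],
                                               [2.0, 3.0, 0.6]]}}}}
"""


def summarize(forecast):
    """Return one (location, target, unit, point, top_bin) tuple per target."""
    rows = []
    for location, target_dict in forecast["data"].items():
        for target, info in target_dict.items():
            # Each bin is [start (inclusive), end (not inclusive), value].
            top_bin = max(info["bins"], key=lambda b: b[2])
            rows.append((location, target, info["unit"], info["point"], top_bin))
    return rows


rows = summarize(json.loads(sample_text))
for location, target, unit, point, (start, end, value) in rows:
    print(f"{location} / {target}: point={point} {unit}, "
          f"top bin=[{start}, {end}) value={value}")
```

    The same nested loops can be adapted to flatten the data into rows for a spreadsheet or data frame.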