How to Bulk Create Point Count Observations

Go to the Bulk Uploader (https://data.pointblue.org/science/projectmanager/bulk_uploader) and select your project. You will only see projects that you have access to.

The Add Observations module loads both sampling events and observations into your project. Bulk uploading observations requires two files: a CSV file with observation data, and a YAML configuration file describing how the CSV file columns map into your project database. CSV and YAML templates are generated automatically based on your protocol selections, and three formats are available to accommodate common data structures.

The CSV file you load should have one observation record per line (see the three accepted formats below for specifics). Observations are organized into events as part of this process: all observations recorded at a single Sampling Unit at a given date and time under a Protocol make up one sampling event. All event information, such as date and Sampling Unit name, should be repeated for each observation record in the CSV file (that is, the file is "flattened" or denormalized).
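As a minimal sketch of what "flattened" means, the Python below repeats each event's fields on every observation row. The column names (`sampling_unit`, `date`, `time`, `protocol`, `species`, `count`) and the sample values are hypothetical; the real headers, including the second header row, come from the template generated for your protocol.

```python
import csv
import io

# Hypothetical column names; the real headers come from the generated CSV template.
EVENT_FIELDS = ["sampling_unit", "date", "time", "protocol"]
OBS_FIELDS = ["species", "count"]


def flatten(events):
    """Yield one flat row per observation, repeating the event fields on each row."""
    for event in events:
        event_part = {name: event[name] for name in EVENT_FIELDS}
        for obs in event["observations"]:
            yield {**event_part, **{name: obs[name] for name in OBS_FIELDS}}


def to_csv(events):
    """Write the flattened rows to CSV text with a single header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=EVENT_FIELDS + OBS_FIELDS, lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(flatten(events))
    return buffer.getvalue()


# Illustrative data only: one sampling event with two observations.
events = [
    {
        "sampling_unit": "PT01",
        "date": "2020-05-01",
        "time": "06:30",
        "protocol": "VCP25_150",
        "observations": [
            {"species": "SOSP", "count": 2},
            {"species": "AMRO", "count": 1},
        ],
    },
]

print(to_csv(events))
```

Both output rows carry the same Sampling Unit, date, time, and protocol, which is how rows can be grouped back into a single sampling event.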

To use the tool:

  1. Choose the Add Observations tab.

     

  2. If this is your first time using this tool to upload data of a particular type and protocol:

    1. Select “Get a blank Template to fill out” and “Point Count”, then select a Protocol and Site Condition Protocol. Note that the Protocols must already exist in the Project before they will show up in the list. If you need a new protocol added, please submit a request to support@pointbluehelp.zendesk.com.

       

    2. Select a layout style for your observation data: Standard, By Distance Bins, or By Time Bins. See the data format examples at the end of this page for graphics that demonstrate the different format choices using the same dataset. All three format options result in the same data presentation in Biologists.

      1. Standard is formatted as one record per unique combination of species, detection cue, distance bin, time bin, breeding status, and singing indicator.

      2. Distance Bins is formatted as one record per unique combination of species, detection cue, time bin, breeding status, and singing indicator. Counts for each distance bin in the protocol are entered as separate fields in the same record.

      3. Time Bins is formatted as one record per unique combination of species, detection cue, distance bin, breeding status, and singing indicator. Counts for each time bin in the protocol are entered as separate fields in the same record.

      4. NOTE: Site condition protocol data are formatted and entered the same way regardless of which layout style is selected for the observations.

    3. Download CSV and YAML files. These files are custom-generated based on the Protocol and Site Condition Protocol you selected above.

  3. Populate a CSV file with your observation data in one of the three accepted formats. The top row of headers in the template defines what should be in each column. The second row of headers is required for the YAML to process the file. If you wish to change the name of a header, you will need to change the corresponding line in the YAML file. However, in most cases, the YAML file should not need to be modified.




    Note 1: All entries must match the field definitions in the protocol. See Protocol Search for the valid codes to use in the protocol-specific fields. For example, for the protocol definition VCP25_150, the distance bin values you must enter in your CSV file are L25, L50, L75, G00, B00, B20, B50, or FLO.

    Note 2: The field No Animals Observed will override ALL observations if set to “y” or “x”. The observation data will not be loaded, but site condition data will load.

    Note 3: The tool will accept explicit zeros. However, the preferred practice is to populate a count only for species that were observed. In general surveys (i.e., surveys not targeted at particular species), whether a missing species represents a true absence is left to the researcher's interpretation. It could be that species A was expected to be there and was not detected, that species A was expected to be there but the survey methodology would not detect it, or that species A was not expected to be there. For a 0 to be valid, two strong assumptions must hold: the species must be there and available for detection, and the survey methodology must permit its detection. Users are responsible for deciding what a zero means in their project, and for making sure a researcher 50 years from now can understand it.

     

  4. Upload your data by:

    1. Selecting “Bulk upload datafile.”

    2. Selecting your populated CSV file to upload.

    3. Selecting the YAML file that corresponds to your CSV file.

    4. Selecting “Process” to upload the data.

       

  5. Results will be displayed. Review these carefully to make sure both site conditions and observations loaded. You can also go into Biologists to confirm that the data uploaded and are correct.


    If there is a problem with your data preventing it from loading, you will see an error message detailing the issue and what line it first occurs on.


    Note: If your observation count shows 0 in Biologists or in the Results screen of the Bulk Uploader, but the site conditions loaded, verify that the No Animals Observed field is set correctly; this field overrides all other observation data and prevents any observations from loading. If you expected the observations to load, delete the sampling event with the partial dataset from Biologists, correct the No Animals Observed field, and then upload the files again. It can also be helpful to test just the observation protocol data without site conditions.
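Some of the problems described in the notes above can be caught before uploading. The sketch below is a hypothetical pre-check, not part of the Bulk Uploader. It flags distance bin codes outside the VCP25_150 list from Note 1, and observation rows that will be skipped because No Animals Observed is set. The column names are assumptions, and it assumes a single header row, so line numbers will need adjusting for the template's second header row.

```python
import csv
import io

# Distance bin codes listed in Note 1 for protocol VCP25_150.
VALID_DISTANCE_BINS = {"L25", "L50", "L75", "G00", "B00", "B20", "B50", "FLO"}
# Values that switch on the No Animals Observed override (Note 2).
NO_ANIMALS_FLAGS = {"y", "x"}


def check_rows(rows, bin_col="distance_bin", flag_col="no_animals_observed",
               species_col="species"):
    """Return (line_number, message) pairs for rows likely to fail or be skipped.

    The column names are hypothetical defaults; pass the headers your template uses.
    """
    problems = []
    for line_number, row in enumerate(rows, start=2):  # line 1 is the header
        # Comparing the flag case-insensitively is an assumption of this sketch.
        flag = (row.get(flag_col) or "").strip().lower()
        species = (row.get(species_col) or "").strip()
        bin_code = (row.get(bin_col) or "").strip()
        if flag in NO_ANIMALS_FLAGS:
            if species:
                problems.append(
                    (line_number,
                     "No Animals Observed is set; this observation will not load")
                )
            continue
        if bin_code not in VALID_DISTANCE_BINS:
            problems.append((line_number, f"unknown distance bin {bin_code!r}"))
    return problems


# Illustrative data only.
sample = """species,distance_bin,no_animals_observed
SOSP,L25,
AMRO,L30,
WREN,L50,y
"""

problems = check_rows(csv.DictReader(io.StringIO(sample)))
for line_number, message in problems:
    print(f"line {line_number}: {message}")
```

For other protocols, replace `VALID_DISTANCE_BINS` with the codes shown in Protocol Search.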

Data format examples:

  1. Standard format example

  2. Distance Bin format example

  3. Time Bin format example

  4. Results in Biologists (note, date was changed in these examples to allow loading all three formats)
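To illustrate how the Standard and Distance Bin layouts relate, here is a rough Python sketch that collapses Standard-style records into one record per species, detection cue, time bin, breeding status, and singing indicator, with one count field per distance bin. All field names and sample codes are hypothetical; the generated templates define the actual headers. Bins with no detections are left blank rather than set to 0, following the preference in Note 3.

```python
from collections import defaultdict

# Hypothetical field names; use the headers from your generated template.
KEY_FIELDS = ("species", "detection_cue", "time_bin", "breeding_status", "singing")


def standard_to_distance_bins(rows, bins):
    """Collapse Standard rows into one row per key with one count column per bin."""
    grouped = defaultdict(dict)
    for row in rows:
        bin_code = row["distance_bin"]
        if bin_code not in bins:
            raise ValueError(f"distance bin {bin_code!r} is not in the protocol")
        key = tuple(row[name] for name in KEY_FIELDS)
        counts = grouped[key]
        counts[bin_code] = counts.get(bin_code, 0) + int(row["count"])

    wide = []
    for key, counts in grouped.items():
        record = dict(zip(KEY_FIELDS, key))
        for bin_code in bins:
            # Blank, not 0, for bins without detections.
            record[bin_code] = counts.get(bin_code, "")
        wide.append(record)
    return wide


# Illustrative data only: three Standard rows become two Distance Bin rows.
common = {"time_bin": "1", "breeding_status": "", "singing": "Y"}
standard_rows = [
    {"species": "SOSP", "detection_cue": "S", "distance_bin": "L25", "count": "2", **common},
    {"species": "SOSP", "detection_cue": "S", "distance_bin": "L50", "count": "1", **common},
    {"species": "AMRO", "detection_cue": "V", "distance_bin": "L25", "count": "1", **common},
]

wide_rows = standard_to_distance_bins(standard_rows, bins=["L25", "L50", "L75"])
for record in wide_rows:
    print(record)
```

The Time Bin layout is the same idea with the roles of `distance_bin` and `time_bin` swapped.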