Importing historical equipment data into AWS IoT SiteWise


AWS IoT SiteWise is a managed service that helps customers collect, store, organize, and monitor data from their industrial equipment at scale. Customers often need to bring their historical equipment measurement data from existing systems such as data historians and time series databases into AWS IoT SiteWise to ensure data continuity, train artificial intelligence (AI) and machine learning (ML) models that can predict equipment failures, and derive actionable insights.

In this blog post, we’ll show how you can get started with the BulkImportJob API and import historical equipment data into AWS IoT SiteWise using a code sample.

You can use this imported data to gain insights through AWS IoT SiteWise Monitor and Amazon Managed Grafana, train ML models on Amazon Lookout for Equipment and Amazon SageMaker, and power analytical applications.

To begin a bulk import, customers need to upload a CSV file containing their historical data in a predefined format to Amazon Simple Storage Service (Amazon S3). After uploading the CSV file, customers can initiate the asynchronous import to AWS IoT SiteWise using the CreateBulkImportJob operation, and monitor the progress using the DescribeBulkImportJob and ListBulkImportJobs operations.


To follow along with this blog post, you will need an AWS account and an AWS Region where AWS IoT SiteWise is supported. If you are already using AWS IoT SiteWise, choose a different Region for an isolated environment. You’re also expected to have some familiarity with Python.

Set up the environment

  1. Create an AWS Cloud9 environment using the Amazon Linux 2 platform
  2. Using the terminal in your Cloud9 environment, install Git and clone the sitewise-bulk-import-example repository from GitHub
    sudo yum install git
    git clone
    cd aws-iot-sitewise-bulk-import-example
    pip3 install -r requirements.txt


For the demonstration in this post, we’ll use an AWS Cloud9 instance to represent an on-premises developer workstation and simulate two months of historical data for a few production lines in an automobile manufacturing facility.

We will then prepare the data and import it into AWS IoT SiteWise at scale, leveraging multiple bulk import jobs. Finally, we’ll verify whether the data was imported successfully.

[Figure: AWS IoT SiteWise BulkImportJob Architecture]

A bulk import job can import data into the two storage tiers offered by AWS IoT SiteWise, depending on how the storage is configured. Before we proceed, let us first define these two storage tiers.

Hot tier: Stores frequently accessed data with lower write-to-read latency. This makes the hot tier ideal for operational dashboards, alarm management systems, and any other applications that require fast access to the latest measurement values from equipment.

Cold tier: Stores less frequently accessed data with higher read latency, making it ideal for applications that require access to historical data. For instance, it can be used in business intelligence (BI) dashboards, artificial intelligence (AI), and machine learning (ML) training. To store data in the cold tier, AWS IoT SiteWise uses an S3 bucket in the customer’s account.

Retention period: Determines how long your data is stored in the hot tier before it’s deleted.

Now that we have learned about the storage tiers, let us look at how a bulk import job handles writes in different scenarios. Refer to the table below:

Value    | Timestamp | Write behavior
New      | New       | A new data point is created
New      | Existing  | The existing data point is updated with the new value for the provided timestamp
Existing | Existing  | The import job identifies duplicate data and discards it. No changes are made to existing data.
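
To make the three rows of the table concrete, here is a minimal sketch that models them with an in-memory dictionary. It is only an illustration of the rules as described above, not how AWS IoT SiteWise is implemented; the `apply_write` function and the `Store` type are hypothetical names introduced for this example.

```python
from typing import Dict, Tuple

# Hypothetical in-memory stand-in for stored data points:
# (property key, timestamp in epoch seconds) -> value
Store = Dict[Tuple[str, int], float]


def apply_write(store: Store, property_key: str, timestamp: int, value: float) -> str:
    """Apply one imported data point and report which table row it matched."""
    key = (property_key, timestamp)
    if key not in store:
        # New value, new timestamp: a new data point is created.
        store[key] = value
        return "created"
    if store[key] == value:
        # Existing value, existing timestamp: duplicate, nothing changes.
        return "discarded"
    # New value, existing timestamp: the stored value is overwritten.
    store[key] = value
    return "updated"


if __name__ == "__main__":
    store: Store = {}
    for value in (71.5, 72.0, 72.0):
        print(apply_write(store, "press-a/temperature", 1700000000, value))
```

Running the loop at the bottom walks through the three rows in order for a single timestamp.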

In the next section, we’ll follow step-by-step instructions to import historical equipment data into AWS IoT SiteWise.

Steps to import historical data

Step 1: Create a sample asset hierarchy

For the purpose of this demonstration, we’ll create a sample asset hierarchy for a fictitious automobile manufacturer with operations across four different cities. In a real-world scenario, you may already have an existing asset hierarchy in AWS IoT SiteWise, in which case this step is optional.

Step 1.1: Review the configuration

  1. From the terminal, navigate to the root of the Git repo.
  2. Review the configuration for asset models and assets.
    cat config/assets_models.yml
  3. Review the schema for asset properties.
    cat schema/sample_stamping_press_properties.json

Step 1.2: Create asset models and assets

  1. Run python3 src/ to automatically create asset models, hierarchy definitions, assets, and asset associations.
  2. In the AWS Console, navigate to AWS IoT SiteWise, and verify the newly created Models and Assets.
  3. Verify that you see an asset hierarchy similar to the one below.
     [Figure: Sample SiteWise Asset Hierarchy]

Step 2: Prepare historical data

Step 2.1: Simulate historical data

In this step, for demonstration purposes, we’ll simulate two months of historical data for four stamping presses across two production lines. In a real-world scenario, this data would typically come from source systems such as data historians and time series databases.

The CreateBulkImportJob API has the following key requirements:

  • To identify an asset property, you will need to specify either an ASSET_ID + PROPERTY_ID combination or the ALIAS. In this blog, we will be using the former.
  • The data needs to be in CSV format.

Follow the steps below to generate data according to these expectations. For more details about the schema, refer to Ingesting data using the CreateBulkImportJob API.

  1. Review the configuration for data simulation.
    cat config/data_simulation.yml
  2. Run python3 src/ to generate simulated historical data for the selected properties and time period. If the total rows exceed rows_per_job as configured in bulk_import.yml, multiple data files will be created to support parallel processing. In this sample, more than 700,000 data points are simulated for the four stamping presses (A-D) across two production lines (Sample_Line 1 and Sample_Line 2). Since we configured rows_per_job as 20,000, a total of 36 data files will be created.
  3. Verify the generated data files under the data directory.
     [Figure: SiteWise historical CSV data files]
  4. The data schema will follow the column_names configured in the bulk_import.yml config file.
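
The simulate-and-split idea in the steps above can be sketched as follows. This is not the repository’s script: the function names, the value range, and the column order in `COLUMN_NAMES` are assumptions for illustration (the real order comes from column_names in bulk_import.yml), and the sketch also assumes the CSV carries no header row.

```python
import csv
import io
import math
import random
from typing import List

# Assumed column order; in the sample repo this comes from bulk_import.yml.
COLUMN_NAMES = ["ASSET_ID", "PROPERTY_ID", "DATA_TYPE", "TIMESTAMP_SECONDS", "QUALITY", "VALUE"]


def simulate_rows(asset_id: str, property_id: str, start_epoch: int,
                  end_epoch: int, interval_seconds: int, seed: int = 0) -> List[List[str]]:
    """Generate one DOUBLE measurement per interval in [start_epoch, end_epoch)."""
    rng = random.Random(seed)  # seeded so repeated runs produce the same data
    return [
        [asset_id, property_id, "DOUBLE", str(ts), "GOOD", f"{rng.uniform(20.0, 80.0):.2f}"]
        for ts in range(start_epoch, end_epoch, interval_seconds)
    ]


def split_into_files(rows: List[List[str]], rows_per_job: int) -> List[str]:
    """Return CSV text chunks, each holding at most rows_per_job rows."""
    chunks = []
    for start in range(0, len(rows), rows_per_job):
        buffer = io.StringIO()
        csv.writer(buffer, lineterminator="\n").writerows(rows[start:start + rows_per_job])
        chunks.append(buffer.getvalue())
    return chunks


def expected_file_count(total_rows: int, rows_per_job: int) -> int:
    """Number of data files needed for total_rows."""
    return math.ceil(total_rows / rows_per_job)
```

With rows_per_job set to 20,000, any row count above 700,000 (and up to 720,000) rounds up to 36 files, which is consistent with the numbers quoted in step 2.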

Step 2.2: Upload historical data to Amazon S3

As AWS IoT SiteWise requires the historical data to be available in Amazon S3, we’ll upload the simulated data to the selected S3 bucket.

  1. Update the data bucket under bulk_import.yml with any existing temporary S3 bucket that can be deleted later.
  2. Run python3 src/ to upload the simulated historical data to the configured S3 bucket.
  3. Navigate to Amazon S3 and verify that the objects were uploaded successfully.
     [Figure: SiteWise Bulkimport historical data in S3]
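
For readers who want to see what the upload step amounts to, here is a minimal sketch. It assumes a Boto3 S3 client is passed in (for example `boto3.client("s3")`) and relies only on that client’s `upload_file` method; the helper name `upload_data_files` and the `data/` key prefix are assumptions chosen to mirror the object keys shown later in the job output.

```python
from pathlib import Path
from typing import Iterable, List


def upload_data_files(s3_client, bucket: str, paths: Iterable[Path],
                      prefix: str = "data/") -> List[str]:
    """Upload each local CSV file to s3://bucket/prefix<file name>.

    s3_client is expected to behave like boto3.client("s3").
    Returns the object keys in upload order.
    """
    keys = []
    for path in sorted(paths):
        key = f"{prefix}{path.name}"
        s3_client.upload_file(str(path), bucket, key)
        keys.append(key)
    return keys


# Example usage (requires boto3 and AWS credentials; bucket name is a placeholder):
#   import boto3
#   upload_data_files(boto3.client("s3"), "<bucket-name>", Path("data").glob("*.csv"))
```

Passing the client in as a parameter keeps the helper easy to exercise with a stub before pointing it at a real bucket.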

Step 3: Import historical data into AWS IoT SiteWise

Before you can import historical data, AWS IoT SiteWise requires that you enable cold tier storage. For additional details, refer to Configuring storage settings.

If you have already activated cold tier storage, consider changing the S3 bucket to a temporary one that can be deleted later while cleaning up the sample resources.

Note that when you change the S3 bucket, none of the data from the existing cold tier S3 bucket is copied to the new bucket. When modifying the S3 bucket location, ensure the IAM role configured under S3 access role has permissions to access the new S3 bucket.

Step 3.1: Configure storage settings

  1. Navigate to AWS IoT SiteWise, select Storage, then select Activate cold tier storage.
  2. Select an S3 bucket location of your choice.
     [Figure: AWS IoT SiteWise Edit Storage]
  3. Select Create a role from an AWS managed template.
  4. Check Activate retention period, enter 30 days, and save.
     [Figure: AWS IoT SiteWise Hot Tier Settings]

Step 3.2: Provide permissions for AWS IoT SiteWise to read data from Amazon S3

  1. Navigate to AWS IAM, select Policies under Access management, and Create policy.
  2. Switch to the JSON tab and replace the content with the following. Replace <bucket-name> with the name of the data S3 bucket configured in bulk_import.yml.
      "Version": "2012-10-17",
      "Statement": [
          "Effect": "Allow",
          "Action": [
          "Resource": ["arn:aws:s3:::<bucket-name>"]
  3. Save the policy with Name as SiteWiseBulkImportPolicy.
  4. Select Roles under Access management, and Create role.
  5. Select Custom trust policy and replace the content with the following.
      "Version": "2012-10-17",
      "Statement": [
          "Sid": "",
          "Effect": "Allow",
          "Principal": {
            "Service": ""
        "Action": "sts:AssumeRole"
  6. Click Next and select the SiteWiseBulkImportPolicy IAM policy created in the previous steps.
  7. Click Next and create the role with Role name as SiteWiseBulkImportRole.
  8. Select Roles under Access management, search for the newly created IAM role SiteWiseBulkImportRole, and click on its name.
  9. Copy the ARN of the IAM role using the copy icon.
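
If you prefer to build the two policy documents programmatically, the sketch below shows one possible shape. Treat its specifics as assumptions rather than the exact policies from this walkthrough: the service principal `iotsitewise.amazonaws.com`, the read-only action list, and the function names are ours for illustration. The error bucket would likely need its own write permissions, which this sketch leaves out.

```python
import json
from typing import Any, Dict

# Assumption: AWS IoT SiteWise assumes the role through this service principal.
SITEWISE_PRINCIPAL = "iotsitewise.amazonaws.com"


def build_trust_policy() -> Dict[str, Any]:
    """Trust policy letting the SiteWise service assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "",
                "Effect": "Allow",
                "Principal": {"Service": SITEWISE_PRINCIPAL},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def build_data_bucket_policy(bucket_name: str) -> Dict[str, Any]:
    """Read-only access to the data bucket and the objects inside it."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Assumed action list; adjust it to your own requirements.
                "Action": ["s3:GetObject", "s3:GetBucketLocation"],
                "Resource": [
                    f"arn:aws:s3:::{bucket_name}",
                    f"arn:aws:s3:::{bucket_name}/*",
                ],
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_trust_policy(), indent=2))
    print(json.dumps(build_data_bucket_policy("<bucket-name>"), indent=2))
```

Printing the documents as JSON gives text you can review before pasting it into the IAM console.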

Step 3.3: Create AWS IoT SiteWise bulk import jobs

  1. Replace the role_arn field in config/bulk_import.yml with the ARN of the SiteWiseBulkImportRole IAM role copied in the previous steps.
  2. Update the config/bulk_import.yml file:
    • Replace the role_arn with the ARN of the SiteWiseBulkImportRole IAM role.
    • Replace the error_bucket with any existing temporary S3 bucket that can be deleted later.
  3. Run python3 src/ to import historical data from the S3 bucket into AWS IoT SiteWise.
  4. The script will create multiple jobs to simultaneously import all the data files created into AWS IoT SiteWise. In a real-world scenario, several terabytes of data can be quickly imported into AWS IoT SiteWise using concurrently running jobs.
  5. Check the status of jobs from the output:
    Total S3 objects: 36
    Number of bulk import jobs to create: 36
            Created job: 03e75fb2-1275-487f-a011-5ae6717e0c2e for importing data from data/historical_data_1.csv S3 object
            Created job: 7938c0d2-f177-4979-8959-2536b46f91b3 for importing data from data/historical_data_10.csv S3 object
    Checking job status every 5 secs until completion.
            Job id: 03e75fb2-1275-487f-a011-5ae6717e0c2e, status: COMPLETED
            Job id: 7938c0d2-f177-4979-8959-2536b46f91b3, status: COMPLETED

  6. If you see the status of any job as COMPLETED_WITH_FAILURES or FAILED, refer to the Troubleshoot common issues section.
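
The create-then-poll pattern behind these steps can be sketched as below. This is not the repository’s script. It assumes a Boto3 AWS IoT SiteWise client is passed in (for example `boto3.client("iotsitewise")`) and uses its `create_bulk_import_job` and `describe_bulk_import_job` operations; the helper names, the `errors/` prefix, and the column list are assumptions for illustration.

```python
import time
from typing import Any, Callable, Dict, Iterable, List

# Assumed column order; it must match the layout of the uploaded CSV files.
COLUMN_NAMES = ["ASSET_ID", "PROPERTY_ID", "DATA_TYPE", "TIMESTAMP_SECONDS", "QUALITY", "VALUE"]

# Job states after which polling can stop.
TERMINAL_STATUSES = {"COMPLETED", "COMPLETED_WITH_FAILURES", "FAILED", "CANCELLED"}


def build_job_request(job_name: str, role_arn: str, data_bucket: str, key: str,
                      error_bucket: str, column_names: Iterable[str],
                      error_prefix: str = "errors/") -> Dict[str, Any]:
    """Build the keyword arguments for one CreateBulkImportJob call."""
    return {
        "jobName": job_name,
        "jobRoleArn": role_arn,
        "files": [{"bucket": data_bucket, "key": key}],
        "errorReportLocation": {"bucket": error_bucket, "prefix": error_prefix},
        "jobConfiguration": {"fileFormat": {"csv": {"columnNames": list(column_names)}}},
    }


def run_jobs(client, requests: List[Dict[str, Any]], poll_seconds: float = 5.0,
             sleep: Callable[[float], None] = time.sleep) -> Dict[str, str]:
    """Create one job per request, then poll until every job reaches a terminal state."""
    pending = [client.create_bulk_import_job(**request)["jobId"] for request in requests]
    statuses: Dict[str, str] = {}
    while pending:
        still_running = []
        for job_id in pending:
            status = client.describe_bulk_import_job(jobId=job_id)["jobStatus"]
            if status in TERMINAL_STATUSES:
                statuses[job_id] = status
            else:
                still_running.append(job_id)
        pending = still_running
        if pending:
            sleep(poll_seconds)
    return statuses
```

All jobs are created up front so they can run concurrently on the service side, and polling happens afterwards in a single loop, which mirrors the "Checking job status every 5 secs" line in the output above.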

Step 4: Verify the imported data

Once the bulk import jobs are completed, we need to verify that the historical data was successfully imported into AWS IoT SiteWise. You can verify the data either by looking directly at the cold tier storage or by visually inspecting the charts available in AWS IoT SiteWise Monitor.

Step 4.1: Using the cold tier storage

In this step, we’ll check if new S3 objects were created in the bucket that was configured for cold tier.

  1. Navigate to Amazon S3 and locate the S3 bucket configured under AWS IoT SiteWise → Storage → S3 bucket location (in Step 3) for cold tier storage.
  2. Verify the partitions and objects under the raw/ prefix.
     [Figure: AWS IoT SiteWise Cold Tier files]
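
The same check can be done from code. Here is a minimal sketch, assuming a Boto3 S3 client is passed in and using its `list_objects_v2` operation; the helper name `list_cold_tier_keys` is hypothetical, and only the first page of results is inspected.

```python
from typing import List


def list_cold_tier_keys(s3_client, bucket: str, prefix: str = "raw/",
                        max_keys: int = 20) -> List[str]:
    """Return up to max_keys object keys found under the given prefix.

    s3_client is expected to behave like boto3.client("s3").
    An empty list suggests nothing has been written to the cold tier yet.
    """
    response = s3_client.list_objects_v2(Bucket=bucket, Prefix=prefix, MaxKeys=max_keys)
    return [obj["Key"] for obj in response.get("Contents", [])]


# Example usage (requires boto3 and AWS credentials; bucket name is a placeholder):
#   import boto3
#   print(list_cold_tier_keys(boto3.client("s3"), "<cold-tier-bucket>"))
```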

Step 4.2: Using AWS IoT SiteWise Monitor

In this step, we’ll visually check if the charts show data for the imported date range.

  1. Navigate to AWS IoT SiteWise and locate Monitor.
  2. Create a portal to access data stored in AWS IoT SiteWise.
    • Provide AnyCompany Motor as the Portal name.
    • Choose IAM for User authentication.
    • Provide your email address for Support contact email, and click Next.
    • Leave the default configuration for Additional features, and click Create.
    • Under Invite administrators, select your IAM user or IAM Role, and click Next.
    • Click on Assign Users.
  3. Navigate to Portals and open the newly created portal.
  4. Navigate to Assets and select an asset, for example, AnyCompany_Motor → Sample_Arlington → Sample_Stamping → Sample_Line 1 → Sample_Stamping Press A.
  5. Use Custom range to match the date range for the data uploaded.
  6. Verify the data rendered in the time series line chart.
     [Figure: SiteWise Monitor Example]

Troubleshoot common issues

In this section, we’ll cover the common issues encountered while importing data using bulk import jobs and highlight some possible causes.

If a bulk import job does not complete successfully, it is best practice to check the logs in the error S3 bucket configured in bulk_import.yml to understand the root cause.
[Figure: SiteWise BulkImportJob Error Bucket]

No data imported

  • Incorrect schema: dataType doesn't match dataType tied to the asset-property
    The schema provided at Ingesting data using the CreateBulkImportJob API should be followed exactly. Using the console, verify that the provided DATA_TYPE matches the data type in the corresponding asset model property.
  • Incorrect ASSET_ID or PROPERTY_ID: Entry isn't modeled
    Using the console, verify that the corresponding asset and property exist.
  • Duplicate data: A value for this timestamp already exists
    AWS IoT SiteWise detects and automatically discards any duplicate. Using the console, verify whether the data already exists.
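
Some of these problems can be caught locally before uploading. The sketch below is a hypothetical pre-flight check, not the validation AWS IoT SiteWise itself performs: it assumes rows in the order ASSET_ID, PROPERTY_ID, DATA_TYPE, TIMESTAMP_SECONDS, QUALITY, VALUE, and it assumes you supply an `expected_types` mapping of modeled (asset, property) pairs to their data types. It only detects duplicates within the file, not against data already stored.

```python
from typing import Dict, List, Sequence, Set, Tuple


def find_row_problems(rows: Sequence[Sequence[str]],
                      expected_types: Dict[Tuple[str, str], str]) -> List[str]:
    """Flag rows likely to be rejected or discarded by a bulk import job."""
    problems: List[str] = []
    seen: Set[Tuple[str, str, str]] = set()
    for index, (asset_id, property_id, data_type, timestamp, _quality, _value) in enumerate(rows):
        modeled_type = expected_types.get((asset_id, property_id))
        if modeled_type is None:
            # Corresponds to the "Entry isn't modeled" error.
            problems.append(f"row {index}: asset/property pair is not modeled")
            continue
        if data_type != modeled_type:
            # Corresponds to the dataType mismatch error.
            problems.append(
                f"row {index}: DATA_TYPE {data_type} does not match modeled type {modeled_type}"
            )
        point = (asset_id, property_id, timestamp)
        if point in seen:
            # Same property and timestamp appears twice in this file.
            problems.append(f"row {index}: duplicate timestamp {timestamp}")
        seen.add(point)
    return problems
```

An empty result does not guarantee a clean import, but a non-empty one points at rows worth fixing before creating the job.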

Missing only certain parts of data

  • Missing recent data: The BulkImportJob API imports recent data (data that falls within the hot tier retention period) into the AWS IoT SiteWise hot tier and does not transfer it immediately to Amazon S3 (cold tier). You may need to wait for the next hot to cold tier transfer cycle, which is currently set to 6 hours.
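
To reason about which imported points may not yet be visible in the cold tier bucket, a small helper like the one below can help. It is a sketch of the rule as described above, with the 30-day default taken from the retention period configured in Step 3.1; the function name and the inclusive boundary handling are assumptions.

```python
from datetime import datetime, timedelta, timezone


def lands_in_hot_tier(point_time: datetime, now: datetime, retention_days: int = 30) -> bool:
    """Return True if a data point's timestamp falls within the hot tier retention period.

    Such points are written to the hot tier first and only appear in the
    cold tier bucket after the next hot to cold transfer cycle.
    """
    return now - point_time <= timedelta(days=retention_days)


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(lands_in_hot_tier(now - timedelta(days=10), now))  # within retention
    print(lands_in_hot_tier(now - timedelta(days=45), now))  # older than retention
```

In this walkthrough, two months of simulated data with a 30-day retention period means roughly the most recent half of the data would be expected to land in the hot tier first.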

Clean Up

To avoid any recurring charges, remove the resources created in this blog. Follow these steps to delete the resources:

  1. Navigate to AWS Cloud9 and delete your environment.
  2. Run python3 src/ to delete the following resources, in order, from AWS IoT SiteWise:
    • Asset associations
    • Assets
    • Hierarchy definitions from asset models
    • Asset models
  3. From the AWS IoT SiteWise console, navigate to Monitor → Portals, select the previously created portal, and delete it.
  4. Navigate to Amazon S3 and perform the following:
    • Delete the S3 bucket location configured under the Storage section of AWS IoT SiteWise
    • Delete the data and error buckets configured in the /config/bulk_import.yml of the Git repo


In this post, you have learned how to use the AWS IoT SiteWise BulkImportJob API to import historical equipment data into AWS IoT SiteWise using the AWS SDK for Python (Boto3). You can also use the AWS CLI or SDKs for other programming languages to perform the same operation. To learn more about all supported ingestion mechanisms for AWS IoT SiteWise, visit the documentation.

About the authors

Raju Gottumukkala is an IoT Specialist Solutions Architect at AWS, helping industrial manufacturers in their smart manufacturing journey. Raju has helped major enterprises across the energy, life sciences, and automotive industries improve operational efficiency and revenue growth by unlocking the true potential of IoT data. Prior to AWS, he worked for Siemens and co-founded dDriven, an Industry 4.0 Data Platform company.
Avik Ghosh is a Senior Product Manager on the AWS Industrial IoT team, focusing on the AWS IoT SiteWise service. With over 18 years of experience in technology innovation and product delivery, he specializes in Industrial IoT, MES, Historian, and large-scale Industry 4.0 solutions. Avik contributes to the conceptualization, research, definition, and validation of Amazon IoT service offerings.
