[PC-1966][PXCT-96] AWS S3 Data Exporter Documentation #2397

Open
wants to merge 23 commits into
base: main

ArduinoBot
Collaborator

What This PR Changes

Documentation addition based on Arduino AWS S3 CSV Exporter by Marco Colombo

Contribution Guidelines

@ArduinoBot added the Tutorial and arduino (Bugs and fixes added by the Arduino Team) labels on Feb 13, 2025

github-actions bot commented Feb 13, 2025

Preview Deployment

🚀 Preview this PR: https://67ad8856ab21fce469a13ec0--docs-content.netlify.app
📍 Commit SHA: c02ee4f

@TaddyHC requested a review from canchebagur on February 13, 2025 21:08
Contributor

@canchebagur left a comment

My review is ready, @TaddyHC. Good job here. Almost all of my comments are suggestions for improving the writing.


## Overview

The **Arduino AWS S3 CSV Exporter** is designed to extract time series data from **Arduino Cloud** and publish it to an **AWS S3** bucket in CSV format.

Suggested change
The **Arduino AWS S3 CSV Exporter** is designed to extract time series data from **Arduino Cloud** and publish it to an **AWS S3** bucket in CSV format.
The Arduino AWS S3 CSV Exporter is designed to extract time series data from Arduino Cloud and publish it in CSV format to an AWS S3 bucket.


A scheduled AWS Lambda function manages the data extraction process, running at configurable intervals. The extraction frequency, sampling resolution and filters can be customized to refine the data stored in S3.

Suggested change
A scheduled AWS Lambda function manages the data extraction process, running at configurable intervals. The extraction frequency, sampling resolution and filters can be customized to refine the data stored in S3.
A scheduled AWS Lambda function runs the data extraction process at configurable intervals. The extraction frequency, sampling resolution and filters can be customized to refine the data stored in S3.


![Arduino AWS S3 CSV Exporter Build](assets/cloudformation_stack_creation.gif)

At the end of this tutorial, the stack will be configured to extract data from Arduino Cloud every hour, aggregate samples at a five minute resolution and store structured CSV files in AWS S3. The setup will also allow filtering by tags to include only specific data, providing a scalable and structured approach to managing cloud connected device data and ensuring easy retrieval and long term storage.

Suggested change
At the end of this tutorial, the stack will be configured to extract data from Arduino Cloud every hour, aggregate samples at a five minute resolution and store structured CSV files in AWS S3. The setup will also allow filtering by tags to include only specific data, providing a scalable and structured approach to managing cloud connected device data and ensuring easy retrieval and long term storage.
At the end of this tutorial, the stack will be configured to extract data from Arduino Cloud every hour, aggregate samples at a five-minute resolution, and store structured CSV files in AWS S3. The setup will also allow filtering by tags to include only specific data, providing a scalable and structured approach to managing cloud-connected device data and ensuring easy retrieval and long-term storage.
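
For readers of this thread, the hourly extraction and five-minute resolution described in this paragraph can be illustrated with a minimal Python sketch. It is not the exporter's code: the function names `extraction_window` and `bucket_starts` are hypothetical, and the assumption that each run covers the most recent complete hour aligned to the clock is mine, not something this PR states.

```python
from datetime import datetime, timedelta, timezone


def extraction_window(now: datetime, period: timedelta = timedelta(hours=1)):
    """Return the most recent complete window [start, end) aligned to `period`.

    Assumption: windows are aligned to multiples of `period` since the Unix epoch.
    """
    seconds = int(period.total_seconds())
    epoch = int(now.timestamp())
    end = datetime.fromtimestamp(epoch - epoch % seconds, tz=timezone.utc)
    return end - period, end


def bucket_starts(start: datetime, end: datetime,
                  resolution: timedelta = timedelta(minutes=5)):
    """List the start time of every aggregation bucket inside [start, end)."""
    starts = []
    current = start
    while current < end:
        starts.append(current)
        current += resolution
    return starts


if __name__ == "__main__":
    now = datetime(2025, 2, 13, 21, 8, tzinfo=timezone.utc)
    start, end = extraction_window(now)
    buckets = bucket_starts(start, end)
    print(start.isoformat(), end.isoformat(), len(buckets))
```

Under these assumptions, a one-hour window at a five-minute resolution yields 12 buckets, so each property would contribute at most 12 aggregated rows per run.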


## Required Software

* [Arduino Cloud](https://cloud.arduino.cc/). **If you do not have an account, you can create one for free inside [cloud.arduino.cc](https://cloud.arduino.cc/home/?get-started=true)**.

Suggested change
* [Arduino Cloud](https://cloud.arduino.cc/). **If you do not have an account, you can create one for free inside [cloud.arduino.cc](https://cloud.arduino.cc/home/?get-started=true)**.
* [Arduino Cloud](https://cloud.arduino.cc/). **If you do not have an account, you can create one for free in [cloud.arduino.cc](https://cloud.arduino.cc/home/?get-started=true)**.


Each function execution retrieves data from the selected **Arduino Things** and generates a CSV file. The file is then uploaded to **S3** for structured storage and accessibility.

Data is extracted every hour by default, with samples aggregated at a 5 minute resolution. Both the extraction period and the aggregation rate are configurable. Aggregation is performed by calculating the average over the aggregation period, while non-numeric values, such as strings, are sampled at the specified resolution.

Suggested change
Data is extracted every hour by default, with samples aggregated at a 5 minute resolution. Both the extraction period and the aggregation rate are configurable. Aggregation is performed by calculating the average over the aggregation period, while non-numeric values, such as strings, are sampled at the specified resolution.
Data is extracted every hour by default, with samples aggregated at a 5-minute resolution. Both the extraction period and the aggregation rate are configurable. Aggregation is performed by calculating the average over the aggregation period, while non-numeric values, such as strings, are sampled at the specified resolution.
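
The aggregation rule in this paragraph (average numeric values per period, sample non-numeric ones) could be sketched roughly as below. This is an illustration only, not the exporter's implementation: `bucket_start` and `aggregate` are hypothetical names, and keeping the earliest non-numeric sample in each bucket is an assumption, since the paragraph does not say which sample is chosen.

```python
from collections import defaultdict
from datetime import datetime, timezone
from statistics import mean

RESOLUTION_SECONDS = 300  # 5-minute resolution, the default named in the tutorial


def bucket_start(ts: datetime, resolution: int = RESOLUTION_SECONDS) -> datetime:
    """Round a timestamp down to the start of its aggregation bucket."""
    epoch = int(ts.timestamp())
    return datetime.fromtimestamp(epoch - epoch % resolution, tz=timezone.utc)


def is_numeric(value) -> bool:
    """Treat ints and floats as numeric; booleans are handled like strings."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def aggregate(samples, resolution: int = RESOLUTION_SECONDS):
    """Aggregate (timestamp, value) pairs of one property into buckets.

    Numeric buckets are averaged. For non-numeric buckets the earliest
    sample is kept (an assumption made for this sketch).
    """
    buckets = defaultdict(list)
    for ts, value in sorted(samples, key=lambda s: s[0]):
        buckets[bucket_start(ts, resolution)].append(value)

    rows = []
    for start in sorted(buckets):
        values = buckets[start]
        if all(is_numeric(v) for v in values):
            rows.append((start, mean(values)))
        else:
            rows.append((start, values[0]))
    return rows
```

For example, two temperature readings of 20.0 and 22.0 taken within the same five-minute period would collapse into a single row with the value 21.0, while a string property would keep one of its values unchanged.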


![S3 Bucket date defined organization](assets/lambda_function_cloudwatch_logs.png)

Detailed logs display function specific messages, showing configuration settings such as applied filters, aggregation parameters and time window alignment. Logs also corroborate successful data exports, including file upload status, highlighting any warnings or errors encountered during execution. This helps us verify if it could establish communication with configured Arduino keys.

Suggested change
Detailed logs display function specific messages, showing configuration settings such as applied filters, aggregation parameters and time window alignment. Logs also corroborate successful data exports, including file upload status, highlighting any warnings or errors encountered during execution. This helps us verify if it could establish communication with configured Arduino keys.
Detailed logs display function-specific messages and show configuration settings such as applied filters, aggregation parameters and time window alignment. Logs also corroborate successful data exports, including file upload status, highlighting any warnings or errors encountered during execution. This helps us verify whether it could establish communication with configured Arduino keys.


### EventBridge

**Amazon EventBridge** manages the scheduling of Lambda function executions. It makes sure that the data extraction process runs at predefined intervals without manual intervention.

Suggested change
**Amazon EventBridge** manages the scheduling of Lambda function executions. It makes sure that the data extraction process runs at predefined intervals without manual intervention.
**Amazon EventBridge** manages the scheduling of Lambda function executions. It ensures that the data extraction process runs at predefined intervals without manual intervention.


The **EventBridge Rules** dashboard shows the rule responsible for triggering the *AWS S3 CSV Exporter Lambda function*. The rule type is **Scheduled Standard**, meaning it executes the function at fixed intervals, with its status appearing as Enabled, indicating that it is active and operational.

Suggested change
The **EventBridge Rules** dashboard shows the rule responsible for triggering the *AWS S3 CSV Exporter Lambda function*. The rule type is **Scheduled Standard**, meaning it executes the function at fixed intervals, with its status appearing as Enabled, indicating that it is active and operational.
The **EventBridge Rules** dashboard shows the rule responsible for triggering the **AWS S3 CSV Exporter Lambda function.** The rule type is **Scheduled Standard**, meaning it executes the function at fixed intervals. Its status appears as Enabled, indicating that it is active and operational.
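
As a side note on the fixed-interval schedule mentioned here: EventBridge scheduled rules accept rate expressions such as `rate(1 hour)`, where the unit is singular for a value of 1 and plural otherwise. The helper below is a small sketch of that syntax. `rate_expression` is a hypothetical name, and whether the exporter's CloudFormation template uses a rate expression rather than a cron expression is an assumption not confirmed in this PR.

```python
def rate_expression(value: int, unit: str) -> str:
    """Build an EventBridge rate expression such as 'rate(1 hour)'.

    The unit is singular when the value is 1 and plural otherwise.
    """
    if unit not in ("minute", "hour", "day"):
        raise ValueError(f"unsupported unit: {unit!r}")
    if value < 1:
        raise ValueError("value must be a positive integer")
    suffix = "" if value == 1 else "s"
    return f"rate({value} {unit}{suffix})"


if __name__ == "__main__":
    # Hourly extraction, the default described in the tutorial.
    print(rate_expression(1, "hour"))
    # A shorter interval, for comparison.
    print(rate_expression(30, "minute"))
```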


![S3 Bucket date defined organization](assets/lambda_function_eventbridge_trigger.png)

The combination of Lambda, CloudWatch and EventBridge provides monitoring and maintenance of the AWS S3 CSV Exporter. Lambda handles function execution and triggers, CloudWatch logs real time function activity and performance metrics. At the same time, EventBridge schedules the execution process to maintain continuous data exports.

Suggested change
The combination of Lambda, CloudWatch and EventBridge provides monitoring and maintenance of the AWS S3 CSV Exporter. Lambda handles function execution and triggers, CloudWatch logs real time function activity and performance metrics. At the same time, EventBridge schedules the execution process to maintain continuous data exports.
The combination of Lambda, CloudWatch, and EventBridge monitors and maintains the AWS S3 CSV Exporter. Lambda handles function execution and triggers, while CloudWatch logs real-time function activity and performance metrics. At the same time, EventBridge schedules the execution process to maintain continuous data exports.


This tutorial showed how to use the **Arduino AWS S3 CSV Exporter** to extract time series data from **Arduino Cloud** and store it in **AWS S3** for structured management and analysis. The exporter can be adapted to different use cases with configurable settings for aggregation intervals, tag-based filtering and optional data compression.

By deploying the exporter using a CloudFormation template, you have simplified cloud based data storage for IoT applications. This setup automates data collection, simplifying trend analysis, device monitoring and long-term storage management.

Suggested change
By deploying the exporter using a CloudFormation template, you have simplified cloud based data storage for IoT applications. This setup automates data collection, simplifying trend analysis, device monitoring and long-term storage management.
By deploying the exporter using a CloudFormation template, you have simplified cloud-based data storage for IoT applications. This setup automates data collection, simplifying trend analysis, device monitoring and long-term storage management.

Labels: arduino (Bugs and fixes added by the Arduino Team), preview, Tutorial
3 participants