Introduction
Blecon enables a new class of Bluetooth-based sensors and physical data sources, allowing developers and application builders to quickly integrate real-world data into their products and applications.
Sensors and other data sources that use Blecon can use nearby phones or gateways as hotspots, require zero configuration or coding to use, and deliver data as webhooks, so they work with any cloud, any web framework, and any language.
In this article, I will show you how to use Blecon and AWS to create a real-world streaming data ingestion pipeline in a few minutes, using Amazon Kinesis. Amazon Kinesis is a fully managed, large-scale data streaming service from AWS that lets you process and react to incoming data in real time and route it to multiple consumers.
Once you have created a stream of events in Kinesis, it can be consumed in multiple ways. In this example, I will show how to route it to an Amazon Redshift cluster. Amazon Redshift is a petabyte-scale SQL database typically used for data warehouse applications.
This architecture is suitable for an environment where you have many hundreds or thousands of Blecon sensors all regularly reporting real-world conditions.
This tutorial assumes intermediate-level knowledge of AWS.
Architecture overview
Let’s have a look at what we’ll build.
Because Blecon presents Bluetooth sensor readings as HTTP requests, we can use a Lambda function with a function URL to ingest data into the Kinesis stream. Once the data is in the stream, we use a delivery stream to write it to Redshift.
The delivery stream to Redshift is just one consumer; once the sensor readings are in Amazon Kinesis, you can analyze and consume the same stream in other ways too.
Create Kinesis Data Stream
Let’s start by creating our Kinesis data stream. Kinesis Data Streams is designed for ingesting and processing large-scale streaming data; we’re going to push our sensor data into this stream.
To start, you’ll need an AWS account. If you’ve never used AWS before, this AWS getting started tutorial may be useful.
Note: I have used the us-east-1 region in my examples.
- Open the Kinesis Data Streams page in the AWS Console either using the search or the menu
- Click Create data stream
- Choose a name such as sensorstream
- Choose the On-demand capacity mode
- Click Create
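If you prefer to script this step, the equivalent call with boto3 looks roughly like this (a sketch; it assumes your AWS credentials and region are already configured):

import boto3

kinesis = boto3.client("kinesis")

# Create an on-demand stream with the same name used in the console steps
kinesis.create_stream(
    StreamName="sensorstream",
    StreamModeDetails={"StreamMode": "ON_DEMAND"},
)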
Create ingestion Lambda function
The ingestion Lambda function will receive HTTP requests from your Blecon network and put sensor events onto your Kinesis stream, where they can be consumed later.
- From the AWS Console, click Lambda, and then Create function.
- In the Create Lambda screen:
- Choose Author from scratch.
- Choose a name for the function such as BleconToKinesis.
- Choose a runtime environment. For this example we will be using Python 3.9.
- Expand Advanced settings, then select Enable function URL with auth type NONE.
- Click Create function.
- In the editor window, paste the following code:
import json, boto3
STREAM_NAME = "sensorstream"

kinesis = boto3.client("kinesis")

def lambda_handler(event, context):
    # The function URL delivers the HTTP request body as a JSON string.
    # The field names below are illustrative; adjust them to match the
    # payload your Blecon devices actually send.
    body = json.loads(event["body"])
    device_id = body["device_id"]

    # Blecon devices batch readings, so un-batch them and send a
    # simplified record to the stream for each event
    for reading in body["payload"]["events"]:
        record = {
            "device_id": device_id,
            "sensor_value": reading["value"],
            "timestamp": reading["timestamp"],
        }
        kinesis.put_record(
            StreamName=STREAM_NAME,
            Data=json.dumps(record),
            PartitionKey=device_id,
        )

    return {"statusCode": 200}
Note: Make sure STREAM_NAME matches what you chose earlier.
What does this do? It extracts the sensor events from the Blecon request and puts a subset of them onto the Kinesis stream. Specifically, it un-batches the events and sends a simplified event to the stream containing only the sensor value, the device ID, and the timestamp.
Your application will vary, but keep in mind that Blecon devices batch sensor readings, so you will usually want to un-batch them for further processing.
Set permissions on Lambda Function
Your Lambda function needs permission to post onto the Kinesis stream you created.
- Open the function in the AWS Console and go to Configuration, then Permissions
- Click on the role name to open it in IAM
- Click Add Permissions, then Attach Policy
- Search for AmazonKinesisFullAccess and tick the checkbox
- Click Attach policies at the bottom
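AmazonKinesisFullAccess is the quickest option, but it grants more access than the function needs. If you prefer a least-privilege setup, you could attach an inline policy scoped to your stream instead; a sketch using boto3, where the role name and account ID are placeholders:

import json, boto3

iam = boto3.client("iam")

policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": ["kinesis:PutRecord", "kinesis:PutRecords"],
        # Replace with your stream's ARN
        "Resource": "arn:aws:kinesis:us-east-1:123456789012:stream/sensorstream",
    }],
}

iam.put_role_policy(
    RoleName="your-lambda-role",     # the role you opened in IAM
    PolicyName="PutToSensorStream",
    PolicyDocument=json.dumps(policy),
)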
Create the Redshift Cluster
- Open Amazon Redshift from the AWS Console menu
- Click Create cluster
- Choose a free trial cluster. Note that this provides one free month of Redshift cluster usage; you will need to delete or pause the cluster after this time to avoid incurring charges.
- Set an admin password and make a note of it
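If you’d rather create the cluster programmatically, a minimal boto3 sketch looks like this (the identifier and password are placeholders; note that the free trial is something you select in the console, so check pricing before scripting this):

import boto3

redshift = boto3.client("redshift")

redshift.create_cluster(
    ClusterIdentifier="sensor-cluster",        # illustrative name
    ClusterType="single-node",
    NodeType="dc2.large",
    MasterUsername="admin",
    MasterUserPassword="YourSecurePassword1",  # placeholder
    PubliclyAccessible=True,  # required later so Kinesis can reach the cluster
)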
Create Redshift Table
Once your Redshift cluster has been created, we need to create a table to store your data.
- Open the Redshift Console, and then the cluster you created
- Click Query data at the top and choose Query editor v2
- In the left menu, choose Create then Table
- Choose the default database, dev
- Set the table name to sensors and the schema to public
- In the table schema section of the same screen, add three columns:
  - device_id, type varchar
  - sensor_value, type integer
  - timestamp, type varchar
- Click Create table
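Equivalently, you can run the DDL yourself, either in Query editor v2 or through the Redshift Data API. A sketch using boto3 (the cluster identifier is a placeholder):

import boto3

client = boto3.client("redshift-data")

# "timestamp" is a reserved word in Redshift, so quote it as a column name
ddl = """
CREATE TABLE public.sensors (
    device_id    varchar,
    sensor_value integer,
    "timestamp"  varchar
);
"""

client.execute_statement(
    ClusterIdentifier="sensor-cluster",  # placeholder
    Database="dev",
    DbUser="admin",
    Sql=ddl,
)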
Create the Kinesis Delivery Stream
A Kinesis delivery stream batches up records and saves them to a destination such as S3 or Redshift.
We will use one to write our data to Redshift.
- Open Kinesis in the AWS Console
- Go to Delivery Streams and click Create delivery stream
- As Source, choose Amazon Kinesis Data Streams
- As Destination, choose Amazon Redshift
- Click Create, then you will move on to the more detailed configuration.
- Choose the sensor data stream you already created in Source settings
- Choose the Redshift cluster you created in the previous steps
- Enter admin user and password you created earlier
- Enter dev for the database name
- Enter sensors for the table name
- Enter sensor_value,device_id,timestamp in the Columns field
- Kinesis needs an S3 bucket to collect records before inserting into Redshift. Create it now.
- Kinesis performs a COPY command to insert data into Redshift. In the COPY command options, enter json 'auto'
- Click Create delivery stream to finish
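For reference, the scripted equivalent shows how these settings fit together. A boto3 sketch, where every ARN, the JDBC URL, and the password are placeholders for your own values:

import boto3

firehose = boto3.client("firehose")

firehose.create_delivery_stream(
    DeliveryStreamName="sensors-to-redshift",   # illustrative name
    DeliveryStreamType="KinesisStreamAsSource",
    KinesisStreamSourceConfiguration={
        "KinesisStreamARN": "arn:aws:kinesis:us-east-1:123456789012:stream/sensorstream",
        "RoleARN": "arn:aws:iam::123456789012:role/firehose-source-role",
    },
    RedshiftDestinationConfiguration={
        "RoleARN": "arn:aws:iam::123456789012:role/firehose-delivery-role",
        "ClusterJDBCURL": "jdbc:redshift://sensor-cluster.example.us-east-1.redshift.amazonaws.com:5439/dev",
        "CopyCommand": {
            "DataTableName": "sensors",
            "DataTableColumns": "sensor_value,device_id,timestamp",
            "CopyOptions": "json 'auto'",
        },
        "Username": "admin",
        "Password": "YourSecurePassword1",      # placeholder
        # Firehose stages records in this S3 bucket before running COPY
        "S3Configuration": {
            "RoleARN": "arn:aws:iam::123456789012:role/firehose-delivery-role",
            "BucketARN": "arn:aws:s3:::your-firehose-staging-bucket",
        },
    },
)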
Allow access to Redshift from Kinesis
We need to ensure that public access is enabled, because Kinesis connects to Redshift from outside your VPC.
- Open your cluster in the AWS console.
- At the top, click Actions, then Modify publicly accessible setting; ensure this is enabled, then click Save changes
As a final step we need to allow IP access to your Redshift cluster from the Kinesis IP range.
- Still on the cluster, open the Properties tab
- Scroll down to the VPC Security Group and open it
- Click Edit inbound rules
- Click Add rule
- Create a rule that allows all traffic (or just the default Redshift port, 5439) from a source of 52.70.63.192/27. This is the Kinesis Data Firehose address range for us-east-1; other regions use different ranges.
- Click Save rules
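The same rule can be added programmatically; a sketch (the security group ID is a placeholder, and this version opens only the default Redshift port rather than all traffic):

import boto3

ec2 = boto3.client("ec2")

ec2.authorize_security_group_ingress(
    GroupId="sg-0123456789abcdef0",  # your cluster's VPC security group
    IpPermissions=[{
        "IpProtocol": "tcp",
        "FromPort": 5439,  # default Redshift port
        "ToPort": 5439,
        # Kinesis Data Firehose address range for us-east-1
        "IpRanges": [{"CidrIp": "52.70.63.192/27"}],
    }],
)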
Results
- Ensure data is flowing from your device(s)
- Open Query editor v2 from the cluster main page in the AWS Console
- Run the following query against the dev database:

SELECT * FROM "sensors";
You should start to see a list of sensor values appearing in your data warehouse.
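If you don’t have devices reporting yet, you can exercise the whole pipeline by posting a synthetic request to your function URL. A sketch (the URL is a placeholder, and the payload shape matches the illustrative one used in the Lambda handler above):

import requests

FUNCTION_URL = "https://abc123.lambda-url.us-east-1.on.aws/"  # placeholder

# Same illustrative payload shape as the Lambda handler expects
payload = {
    "device_id": "test-device-1",
    "payload": {
        "events": [
            {"value": 21, "timestamp": "2023-01-01T00:00:00Z"},
            {"value": 22, "timestamp": "2023-01-01T00:01:00Z"},
        ]
    },
}

resp = requests.post(FUNCTION_URL, json=payload, timeout=10)
print(resp.status_code)

Keep in mind that the delivery stream buffers records, so rows can take a minute or more to appear in Redshift.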
Conclusion
In this tutorial, I’ve shown you one way to connect Blecon-enabled Bluetooth devices to a data warehouse.
This architecture scales to thousands of sensor readings per second and petabytes of storage. Applications of this pattern include:
- Building a service that lets your customers track physical metrics over time
- Amassing a large amount of real-world physical data to train an AI model
- Gathering physical analytics across releases, process changes or batches to support product decision-making or compliance
This is just one example; Blecon enables low-cost generic sensors or custom hardware to integrate with any cloud-native system efficiently and securely. See https://blecon.net for more.