The Dataset Development Platform to Accelerate Materials Discovery for Built Environment Applications is a proof of concept designed to collect longitudinal environmental data. This data provides insights into the relationships between facade material properties, facade geometry, and light absorption. The physical prototype consists of an Arduino Nano 33 BLE Sense Rev2 with headers, a SparkFun Air Velocity Sensor Breakout, and an Adafruit MLX90640 IR Thermal Camera Breakout. To complete the platform ecosystem, a laptop receives data transmitted from the prototype via a serial connection. A Python script provides this capability.
A CSV file and heatmap images are the primary artifacts generated from the data collected by the sensor suite. The logged data includes temperature (°F), humidity (%), ambient light (lux), and airflow (mph). The thermal array records the temperature of each IR pixel in degrees Fahrenheit (°F). The system continuously logs environmental conditions and thermal array readings.
A 3D printer creates the prototype using fused deposition modeling. This process involves extruding material layer by layer to create a three-dimensional object. Each component of the prototype uses 3DXTech’s 3DXSTAT ESD-PLA filament, a compound specifically designed to protect against electrostatic discharge.
Investigating the necessary data for AI-driven approaches to material discovery and development
Summary
Off-the-shelf hardware and 3D printing provide affordable data collection for deep learning
This project represents an inquiry into building a dataset relevant to applying deep learning to material discovery. The potential datasets created from affordable, off-the-shelf hardware may enable researchers to better understand the environmental factors that influence the performance of materials in hyperlocal contexts. This supports the goal of improving energy efficiency through either the application of new materials or tactical modifications to existing structures.
One of the primary challenges is creating a consistent dataset based on crowdsourced environmental microconditions across a variety of contexts. This requires not only capturing data on ambient variables but also ensuring that the data collection methods are reliable, standardized, and resilient to environmental noise. There are three primary goals driving this project: (1) implementing a design system that’s flexible and scalable enough to adapt to new technologies and methods of data collection, (2) designing the device to be secure and adaptable to a wide range of environments, and (3) developing a library of 3D-printable mounting components to promote accessibility and customization.
Based on these goals, an iterative approach provided an opportunity to evaluate the relationships between prototype design, environmental condition variables, and diverse contexts. This prototype focused on conditions including temperature, humidity, wind speed, and ambient light in a rural context in the southeastern United States.
Role | Research, Product design, Digital Fabrication
Timeline | February 2024 to April 2024
Discovery
The primary influence on this project is the GNoME project from Google DeepMind. GNoME demonstrates the potential of using AI to discover and develop new materials at scale. The GNoME system uses a graph neural network and a training process called ‘active learning’ [1]. Graph neural networks apply the predictive power of deep learning to rich data structures that represent objects and their relationships as points connected by lines in a graph [2]. In the context of the GNoME project, active learning corresponds to Bayesian optimization, a method that optimizes black-box functions with minimal function evaluations to discover the structure with an optimal property value [3].
Material costs are also impacting housing affordability. Prices for building materials have accelerated since the beginning of 2024. The factors contributing to the rise in material costs include supply and demand, inflation, global factors, and sustainability initiatives [4].
Project Goals
Since the Dataset Development Platform to Accelerate Materials Discovery for Built Environment Applications is a proof of concept, context specificity influenced the project goals. Drawing on secondary research, and to keep the goals accessible, the project framed them using the SMART criteria: Specific, Measurable, Achievable, Relevant, and Time-bound.
Goal one | Implement a design system that's flexible and scalable enough to adapt to new technologies and methods of data collection.
Goal two | Design the monitoring station to be secure and adaptable to a wide range of environments.
Goal three | Develop a library of 3D-printable mounting components to promote accessibility and customization.
In the long term, the development of a system will focus on the capability of autonomous material synthesis. In addition to environmental variables, geometric variables and variables based on human-building interaction are also suitable for inclusion [5].
Hardware Components
The proof of concept utilized off-the-shelf components, including an Arduino Nano 33 BLE Sense Rev2, a SparkFun Air Velocity Sensor Breakout, and an Adafruit MLX90640 IR Thermal Camera Breakout.
Adafruit STEMMA QT / Qwiic JST SH 4-pin cables manage the flow of data between the microcontroller and breakout boards. Two cables (QT to QT, 50mm long) connect the SparkFun Air Velocity Sensor Breakout to the Adafruit MLX90640 IR Thermal Camera Breakout. The next cable (QT to Female Sockets) connects the thermal camera to the Arduino Nano 33 BLE Sense. Data flow occurs through the serial data line (SDA) and serial clock line (SCL). The microcontroller connects to a laptop via USB.
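Because every breakout on this chain shares the same SDA and SCL lines, each device needs a unique I2C address. The following is a minimal Python sketch of a pre-wiring check, not part of the original project. The function name and device labels are illustrative, and the two addresses are the defaults as I understand the datasheets (0x33 for the MLX90640, 0x28 for the FS3000), so verify them before relying on this.

```python
# Assumed default 7-bit I2C addresses; verify against each breakout's datasheet.
DEVICES = {
    "Adafruit MLX90640 thermal camera": 0x33,
    "SparkFun FS3000 air velocity sensor": 0x28,
}


def find_address_conflicts(devices):
    """Return {address: [device names]} for every address claimed more than once."""
    by_address = {}
    for name, address in devices.items():
        by_address.setdefault(address, []).append(name)
    return {addr: names for addr, names in by_address.items() if len(names) > 1}


# The two breakouts used in the prototype do not collide under these assumptions.
print(find_address_conflicts(DEVICES))

# Hypothetical third board that happens to share an address with the FS3000.
with_extra = dict(DEVICES)
with_extra["hypothetical extra breakout"] = 0x28
for addr, names in find_address_conflicts(with_extra).items():
    print(f"0x{addr:02X} is shared by: {', '.join(names)}")
```

Running a check like this before adding a board to the chain flags a clash early. Resolving one usually means choosing a board with a selectable address or adding an I2C multiplexer.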
Arduino Nano 33 BLE Sense Rev2
In this proof of concept, the Arduino Nano 33 BLE Sense Rev2 functions as a flexible microcontroller with the potential integration of Bluetooth Low Energy connectivity.
The sensors on the board include the APDS9960 sensor for gesture, light, proximity, and color, and the HS3003 for temperature and humidity. The LPS22HB sensor for barometric pressure is also available.
SparkFun Air Velocity Sensor Breakout
The SparkFun Air Velocity Sensor Breakout plays a pivotal role in this proof of concept. This surface-mounted sensor measures wind speed, which helps in understanding the impact of airflow on construction materials.
For this sensor, it is important to note the measurement range. This proof of concept uses the FS3000-1005 version, which has a range of 0-7.23 m/s (0-16.2 mph). The other option is the FS3000-1015, which has a range of 0-15 m/s (0-33.6 mph). The accuracy is 5% of the full-scale flow range. The sensor operates with an input voltage of 2.7 to 3.3 V, and the average current draw is 10 mA.
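To make those figures concrete, here is a small Python sketch of the unit conversion and the full-scale accuracy arithmetic. The function names and the constant table are illustrative assumptions, not part of any SparkFun library. The FS3000-1005 full scale is taken as 7.23 m/s, the datasheet figure as I understand it, which is the value that rounds to the 16.2 mph quoted above.

```python
MPS_TO_MPH = 3600 / 1609.344  # metres per second to miles per hour

# Full-scale ranges in m/s (assumed from the figures quoted in the text).
FS3000_RANGE_MPS = {"FS3000-1005": 7.23, "FS3000-1015": 15.0}
ACCURACY_FRACTION = 0.05  # 5% of the full-scale flow range


def mps_to_mph(mps):
    """Convert a speed from metres per second to miles per hour."""
    return mps * MPS_TO_MPH


def accuracy_band_mps(model):
    """Return the +/- accuracy band in m/s for the given sensor variant."""
    return FS3000_RANGE_MPS[model] * ACCURACY_FRACTION


for model, full_scale in FS3000_RANGE_MPS.items():
    band = accuracy_band_mps(model)
    print(
        f"{model}: full scale {full_scale} m/s ({mps_to_mph(full_scale):.1f} mph), "
        f"accuracy +/-{band:.2f} m/s (+/-{mps_to_mph(band):.2f} mph)"
    )
```

Because the accuracy is a fraction of full scale rather than of the reading, the band is the same size at low wind speeds as at high ones. Light breezes therefore carry a proportionally larger uncertainty.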
Adafruit MLX90640 IR Thermal Camera Breakout
The Adafruit MLX90640 IR Thermal Camera Breakout is essential to the hardware setup of this proof of concept. This small camera supports a 24×32 array of IR thermal sensors. This results in 768 individual temperature readings.
The board measures a range from -40°F to 572°F (-40°C to 300°C) with an accuracy of ±3.6°F (±2°C) in the 32°F-212°F (0°C-100°C) range. It has a maximum frame rate of 16Hz. The sensor’s typical current consumption is 23 mA in normal mode.
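The Fahrenheit figures above follow from two different conversions: absolute temperatures get the +32 offset, while an accuracy band is a temperature difference and only gets scaled. The sketch below illustrates this in plain Python, along with how a flat list of 768 readings maps onto the 24×32 grid. The function names and the sample frame are hypothetical stand-ins, not sensor output.

```python
ROWS, COLS = 24, 32  # MLX90640 pixel grid: 24 x 32 = 768 readings


def c_to_f(celsius):
    """Convert an absolute temperature from degrees C to degrees F."""
    return celsius * 9.0 / 5.0 + 32.0


def c_delta_to_f_delta(delta_c):
    """Convert a temperature difference (such as an accuracy band); no +32 offset."""
    return delta_c * 9.0 / 5.0


# Hypothetical flat frame in degrees C, standing in for real sensor output.
flat_frame_c = [20.0 + 0.01 * i for i in range(ROWS * COLS)]

# Pixel (row, col) lives at index row * COLS + col in the flat frame.
grid_f = [
    [c_to_f(t) for t in flat_frame_c[row * COLS:(row + 1) * COLS]]
    for row in range(ROWS)
]

print(len(grid_f), len(grid_f[0]))    # rows, columns
print(c_to_f(-40.0), c_to_f(300.0))   # measurement range in degrees F
print(c_delta_to_f_delta(2.0))        # accuracy band in degrees F
```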
Material Selection
3DXSTAT ESD-PLA 3D Printer Filament
ESD-safe filament forms the basis of the prototype’s three components. These include the prototype carrier, the microcontroller carrier, and the breakout carrier.
The surface resistance of the printed ESD-PLA part will vary depending on the printer’s extruder temperature. For the device, the temperature setting was 220°C. It is important to note the abrasive characteristics of this filament when using a Bowden tube system.
Design
Iterative Design
The design of the prototype for the Dataset Development Platform to Accelerate Materials Discovery for Built Environment Applications follows an iterative approach. Each iteration has its strengths and weaknesses. The first three iterations are cylindrical in form. The first two iterations focus on a device that generates data remotely with low energy consumption, providing the opportunity to use alternative communication protocols.
Iteration One
The first iteration focused on creating a mounting system for the microcontroller only. In addition, this iteration explored different housings. The first housing consisted of a clear PVC pipe that measured 2-1/2″ (63.5 mm) in diameter. The second housing focused on creating a window for the thermal camera to collect data through.
In this iteration, the board carrier mounts directly to the internal structure. This provides the opportunity to orient the device to fit multiple contexts. This iteration also focused on creating a watertight seal so that the device can collect data in multiple environmental conditions.
Iteration Two
The second iteration of the prototype also focuses on the ability to collect data in multiple environmental conditions. In this iteration, seals create a watertight housing. To accommodate mounting contexts, this iteration integrates bearings to adjust the positioning of the camera.
This iteration minimizes the opportunity to introduce new sensors. In addition, this iteration began to examine the role of different mounting types. For example, action cameras, such as GoPro, use a standard type of mounting bracket for positioning in multiple contexts and orientations.
Iteration Three
This iteration of the prototype device focused on positioning mechanisms in a similar fashion to the previous iteration. The prototype embraced the action camera-style mount. In addition, some elements used an alternative filament: the Armadillo 3D Printer Filament (75D) from NinjaTek. This is a rigid TPU filament that has properties similar to nylon.
The housing is also an exploration of different filament types. The housing in this iteration uses transparent PLA from Hatchbox. Based on the number of walls in the housing, it appeared more translucent in nature. This approach provided the opportunity to orient the openings exactly to the sensor type.
Iteration Four
Iteration Four minimized some elements from the previous iterations. The most significant change is the removal of the housing. This prevents data collection in some environmental conditions; however, it allows sensor types to be changed without worrying about access points in the housing. The carrier provides the opportunity to add and remove breakout board carriers as necessary.
In addition, this iteration focused on creating a dynamic mounting system built around a tripod mount. The mount contains a nut that accepts a tripod's standard 1/4-20 UNC bolt.
To further promote adaptability in sensor and microcontroller selection, the microcontroller carrier is friction-fit. This flexibility allows the prototype to accommodate changes in technology.
The breakout board carrier further promotes adaptability. The grid draws inspiration from the Adafruit Swirly Aluminum Mounting Grid for 0.1″ spaced PCBs. The breakout board carrier is friction-fit, allowing for the placement of different breakout boards.
The placement of the breakout board carrier also provides flexibility in maintaining proper orientation to monitor environmental conditions.
Code
The code for this proof of concept was generated in collaboration with OpenAI’s ChatGPT (GPT-4). The process consisted of two key components: an Arduino sketch to read data from the onboard sensors and the attached breakout boards (the SparkFun Air Velocity Sensor and the Adafruit MLX90640 IR Thermal Camera), and a Python script to record data via serial communication and generate heatmaps.
Reading Available Sensors
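#include <Adafruit_MLX90640.h>
// Note: Arduino_HTS221 drives the HTS221 found on the original Nano 33 BLE Sense.
// The Rev2 board carries an HS3003, which is served by the Arduino_HS300x library.
#include <Arduino_HTS221.h>
#include <PDM.h>
#include <cmath>
#include <Arduino_APDS9960.h>

Adafruit_MLX90640 mlx;
float frame[32*24]; // buffer for full frame of temperatures
short sampleBuffer[256]; // 16-bit PDM microphone samples
volatile int samplesRead; // updated by the PDM callback
unsigned long lastRecordTime = 0;
const unsigned long recordInterval = 30000; // 30 seconds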

void setup() {
  Serial.begin(9600);
  while (!Serial);
  Serial.println("Initializing sensors...");

  // Initialize MLX90640
  if (!mlx.begin(MLX90640_I2CADDR_DEFAULT, &Wire)) {
    Serial.println("MLX90640 not found!");
    while (1);
  }
  Serial.println("MLX90640 initialized successfully.");
  mlx.setMode(MLX90640_CHESS);
  mlx.setResolution(MLX90640_ADC_18BIT);
  mlx.setRefreshRate(MLX90640_2_HZ);

  // Initialize other sensors
  if (!HTS.begin()) {
    Serial.println("Failed to initialize HTS221 sensor!");
    while (1);
  } else {
    Serial.println("HTS221 sensor initialized successfully.");
  }

  if (!APDS.begin()) {
    Serial.println("Failed to initialize APDS9960 sensor!");
    while (1);
  } else {
    Serial.println("APDS9960 sensor initialized successfully.");
  }

  PDM.onReceive(onPDMdata);
  if (!PDM.begin(1, 16000)) {
    Serial.println("Failed to start PDM!");
    while (1);
  } else {
    Serial.println("PDM started successfully.");
  }
}

void loop() {
  unsigned long currentTime = millis();
  if (currentTime - lastRecordTime >= recordInterval) {
    // Read temperature and humidity
    float temperature = HTS.readTemperature(FAHRENHEIT);
    float humidity = HTS.readHumidity();
    Serial.print("HTS221 - Temperature = "); Serial.print(temperature); Serial.println(" °F");
    Serial.print("HTS221 - Humidity = "); Serial.print(humidity); Serial.println(" %");

    // Read MLX90640 temperatures
    if (mlx.getFrame(frame) == 0) {
      Serial.println("MLX90640 - Temperatures (°F):");
      for (uint8_t h = 0; h < 24; h++) {
        for (uint8_t w = 0; w < 32; w++) {
          float celsius = frame[h*32 + w];
          float fahrenheit = celsius * 1.8 + 32;
          Serial.print(fahrenheit, 1);
          Serial.print(", ");
        }
        Serial.println();
      }
    } else {
      Serial.println("Failed to read temperature frame from MLX90640.");
    }

    // Read color and light
    if (APDS.colorAvailable()) {
      int r, g, b, a;
      APDS.readColor(r, g, b, a);
      float lux = a * 0.50; // Coefficient needs calibration
      float colorTemperature = calculateColorTemperature(r, g, b);
      Serial.print("APDS9960 - Lux: "); Serial.print(lux);
      Serial.print(", Color Temperature: "); Serial.println(colorTemperature);
    }

    // Log sound data
    if (samplesRead > 0) {
      float db = calculateDecibel();
      Serial.print("PDM - Sound Level = "); Serial.print(db); Serial.println(" dB");
      samplesRead = 0; // Reset the sample read count for the next loop
    }

    lastRecordTime = currentTime; // Update the time for the next interval
  }
}

// PDM callback: copy the available microphone samples into sampleBuffer
void onPDMdata() {
  int bytesAvailable = PDM.available();
  PDM.read(sampleBuffer, bytesAvailable);
  samplesRead = bytesAvailable / 2; // Convert bytes to number of 16-bit samples
}

// Sound level relative to an uncalibrated reference RMS
float calculateDecibel() {
  // 64-bit accumulator: 256 squared 16-bit samples can overflow a 32-bit long
  int64_t sumOfSquares = 0;
  for (int i = 0; i < samplesRead; i++) {
    sumOfSquares += (int64_t)sampleBuffer[i] * (int64_t)sampleBuffer[i];
  }
  float rms = sqrt((float)sumOfSquares / (float)samplesRead);
  float referenceRMS = 0.0355; // Calibration value
  return 20 * log10(rms / referenceRMS);
}

// Approximate correlated color temperature (K) from raw RGB counts
float calculateColorTemperature(int r, int g, int b) {
  float red = r / 255.0;
  float green = g / 255.0;
  float blue = b / 255.0;
  // RGB to CIE XYZ (sRGB/D65 matrix)
  float X = red * 0.4124564 + green * 0.3575761 + blue * 0.1804375;
  float Y = red * 0.2126729 + green * 0.7151522 + blue * 0.0721750;
  float Z = red * 0.0193339 + green * 0.1191920 + blue * 0.9503041;
  // Chromaticity coordinates
  float x = X / (X + Y + Z);
  float y = Y / (X + Y + Z);
  // McCamy's cubic approximation
  float n = (x - 0.3320) / (0.1858 - y);
  return 449.0 * n * n * n + 3525.0 * n * n + 6823.3 * n + 5520.33;
}
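The color temperature helper in the sketch combines the sRGB-to-XYZ matrix with McCamy's cubic approximation. A Python port makes it easy to sanity-check off the board. This is a hedged sketch rather than a calibrated implementation. The function name is my own, and the raw APDS9960 counts are not linear sRGB values, so the result is best read as a relative indicator. Because chromaticity is a ratio, the division by 255 in the sketch cancels out and is omitted here.

```python
def color_temperature_kelvin(r, g, b):
    """Approximate correlated color temperature from RGB counts (McCamy's formula).

    Returns None when the input is all zeros or the formula is undefined.
    """
    # RGB to CIE XYZ using the sRGB/D65 matrix, as in the Arduino sketch.
    X = r * 0.4124564 + g * 0.3575761 + b * 0.1804375
    Y = r * 0.2126729 + g * 0.7151522 + b * 0.0721750
    Z = r * 0.0193339 + g * 0.1191920 + b * 0.9503041
    total = X + Y + Z
    if total == 0:
        return None  # no light measured; chromaticity is undefined
    x = X / total
    y = Y / total
    denominator = 0.1858 - y
    if abs(denominator) < 1e-12:
        return None  # avoid division by zero in McCamy's n
    n = (x - 0.3320) / denominator
    return 449.0 * n**3 + 3525.0 * n**2 + 6823.3 * n + 5520.33


# Equal channels correspond to the D65 white point, so the result lands near 6500 K.
print(round(color_temperature_kelvin(255, 255, 255)))
```

The zero checks matter on hardware: a reading taken in darkness would otherwise divide by zero and print a non-numeric value to the serial log.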
CSV File and Thermal Image Heatmap Creation
import csv
from datetime import datetime, timedelta

import matplotlib.pyplot as plt
import numpy as np
import serial

# Configuration
serial_port = 'COM8'
baud_rate = 9600
thermal_resolution = (24, 32)  # Thermal camera's resolution
csv_file_path = 'sensor-data.csv'  # Path to the CSV file
# Save heatmap every 1 minute
save_interval = timedelta(minutes=1)

# Initialize serial connection
ser = serial.Serial(serial_port, baud_rate, timeout=1)


def save_heatmap(data, timestamp):
    """Render one thermal frame as a PNG heatmap named after its timestamp."""
    plt.imshow(data, cmap='coolwarm', interpolation='nearest')
    plt.colorbar(label='Temperature (°F)')
    plt.title(f'Thermal Camera Heatmap - {timestamp}')
    heatmap_filename = f'heatmap_{timestamp}.png'
    plt.savefig(heatmap_filename)
    plt.close()
    print(f"Saved heatmap as {heatmap_filename}")


def process_thermal_data(thermal_data_lines):
    """Parse the buffered rows into a 24x32 array, or return None on bad data."""
    try:
        flattened_data = [
            float(temp.strip())
            for line in thermal_data_lines
            for temp in line.split(',')
            if temp.strip()
        ]
        if len(flattened_data) == np.prod(thermal_resolution):
            return np.array(flattened_data).reshape(thermal_resolution)
        else:
            print("Data mismatch or incomplete thermal data.")
    except ValueError as e:
        print(f"Error processing data: {e}")
    return None


thermal_data_buffer = []
last_save_time = datetime.now()
collecting_thermal = False

try:
    print("Starting data collection... Press Ctrl+C to stop.")
    while True:
        # errors='replace' keeps a partially received byte from raising UnicodeDecodeError
        line = ser.readline().decode('utf-8', errors='replace').strip()
        timestamp = datetime.now().strftime('%Y-%m-%d %H:%M:%S')

        if "MLX90640 - Temperatures" in line:
            collecting_thermal = True
            continue

        if collecting_thermal:
            if line and ',' in line:  # Collect only valid thermal data lines
                thermal_data_buffer.append(line)
                if len(thermal_data_buffer) == thermal_resolution[0]:
                    collecting_thermal = False
                    thermal_data = process_thermal_data(thermal_data_buffer)
                    if thermal_data is not None and datetime.now() - last_save_time >= save_interval:
                        save_heatmap(thermal_data, timestamp.replace(':', '-'))
                        last_save_time = datetime.now()
                    thermal_data_buffer.clear()
            continue

        # Process non-thermal data only if the line is not empty and not collecting thermal data
        if line and not collecting_thermal:
            with open(csv_file_path, mode='a', newline='') as file:
                writer = csv.writer(file)
                writer.writerow([timestamp, line])
            print(f"Logged non-thermal data: {line}")  # Diagnostic print
except KeyboardInterrupt:
    print("Data collection stopped.")
finally:
    ser.close()  # Release the serial port however the loop ends
The data exists in CSV format without any data filtering. In this image, the complexity of the thermal camera is apparent. There are 768 individual temperature readings.
The Python script outputs an image based on these values. In the heatmaps below, the differences between materials are apparent. This represents residential construction made of brick and a window made of glass.
In addition to the primary structure, secondary objects also interact with the environmental conditions.
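Since the script stores each serial line verbatim next to its timestamp, a later analysis step has to split those lines into sensor, quantity, value, and unit. The following is one possible sketch of that step, assuming the line formats printed by the Arduino sketch above. The function name, regular expressions, and dictionary keys are my own and not part of the original project.

```python
import re

# "<SENSOR> - <rest>", for example "HTS221 - Humidity = 45.00 %"
LINE_PATTERN = re.compile(r"^(\w+) - (.+)$")
# "<name> = <number> <unit>" or "<name>: <number>", possibly repeated on one line
FIELD_PATTERN = re.compile(
    r"([A-Za-z][A-Za-z ]*?)\s*[=:]\s*(-?\d+(?:\.\d+)?)\s*([^\s,\d]*)"
)


def parse_reading(line):
    """Split one logged line into a list of reading dictionaries.

    Lines that do not match the expected format return an empty list.
    """
    match = LINE_PATTERN.match(line.strip())
    if not match:
        return []
    sensor, rest = match.groups()
    return [
        {"sensor": sensor, "quantity": name.strip(), "value": float(value), "unit": unit}
        for name, value, unit in FIELD_PATTERN.findall(rest)
    ]


for sample in (
    "HTS221 - Temperature = 72.50 °F",
    "APDS9960 - Lux: 12.50, Color Temperature: 5000.00",
    "PDM - Sound Level = -3.20 dB",
):
    print(parse_reading(sample))
```

Keeping the raw line in the CSV and parsing afterwards means a change to the sketch's print format does not silently corrupt the log. Only the parser needs updating.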
Closing Thoughts
When considering the availability of data sources, the properties of existing materials are another important factor. Existing materials also have an established knowledge base. The relationship between emerging materials, workforce training, and installation processes needs analysis, and there may be an opportunity to analyze these relationships quantitatively.
One element that still needs resolution is the conflict between I2C addresses. A standalone SparkFun Ambient Light Sensor Breakout has the same address as the SparkFun Air Velocity Sensor Breakout. In this instance, lighting conditions are available through the Arduino Nano 33 BLE Sense instead. A comparison of the accuracy of lighting values between the two sensors may be needed. This also raises questions regarding the orientation of the microcontroller and its associated sensors. Data filtering algorithms are also appropriate.
For continuous outdoor use, another material is more appropriate, since PLA tends to break down and warp under direct sunlight in hot environments. Continuous remote monitoring also provides an opportunity to examine low-power modes, and it depends on creating a watertight housing that remains flexible in sensor implementation.
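As one illustration of the data filtering mentioned above, a rolling median is a simple way to suppress isolated spikes, such as a single gust or a glitched reading, without smearing them into neighboring samples. This sketch is an assumption about what such a filter could look like, not something the project implements. The function name, window size, and sample values are hypothetical.

```python
from collections import deque
from statistics import median


def rolling_median(values, window=3):
    """Return the median of the most recent `window` samples at each step."""
    if window < 1:
        raise ValueError("window must be at least 1")
    recent = deque(maxlen=window)  # automatically drops the oldest sample
    filtered = []
    for value in values:
        recent.append(value)
        filtered.append(median(recent))
    return filtered


# Hypothetical airflow readings in mph with one spurious spike.
airflow_mph = [2.1, 2.0, 2.2, 9.8, 2.1, 2.3, 2.0]
print(rolling_median(airflow_mph, window=3))
```

A median is used instead of a moving average because a single outlier never becomes the middle value of an odd-sized window, so the spike is dropped rather than diluted.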
References
[1] Merchant, A., Batzner, S., Schoenholz, S.S. et al. Scaling deep learning for materials discovery. Nature 624, 80–85 (2023). https://doi.org/10.1038/s41586-023-06735-9
[2] Merritt, R. (2022) What are graph neural networks?, NVIDIA Blog. Available at: https://blogs.nvidia.com/blog/what-are-graph-neural-networks/
[3] Bassman Oftelie, L., Rajak, P., Kalia, R.K. et al. Active learning for accelerated design of layered materials. npj Comput Mater 4, 74 (2018). https://doi.org/10.1038/s41524-018-0129-0
[4] Strong, A. Material Costs Affect Housing Affordability. National Association of Home Builders. Available at: https://www.nahb.org/advocacy/top-priorities/material-costs
[5] Becerik-Gerber, B., Lucas, G., Aryal, A. et al. The field of human building interaction for convergent research and innovation for intelligent built environments. Sci Rep 12, 22092 (2022). https://doi.org/10.1038/s41598-022-25047-y