Available for new opportunities

Industrial Data Engineer
& AI Solutions Builder

I bridge the gap between raw industrial sensor data and actionable intelligence — designing scalable data pipelines, predictive maintenance models, and AI-powered dashboards for heavy industries.

4+
Years in Industrial Data & Quality Engineering
AI
Predictive Maintenance & Anomaly Detection
Global
Cross-border R&D & Stakeholder Coordination

Impact Snapshot

Industrial AI that survives contact with the field

The common thread across my work is turning messy operational data into systems engineers can trust, maintain, and improve.

Reliability

Move from dashboards to decisions

I design analytics that connect model outputs to field actions: what changed, why it matters, and what a service or quality team should do next.

Data Product

Make industrial data usable

I translate equipment behavior into schemas, features, validation rules, and notebooks that data scientists and domain engineers can both reason about.

Execution

Bridge R&D, field, and delivery

I work across Japanese and English technical teams, keeping requirements, data definitions, experiments, and deployment constraints aligned.

Best fit Industrial AI, IIoT platforms, predictive maintenance, quality analytics
Strength Domain-grounded data engineering from raw telemetry to model-ready features
Working style Hands-on builder, careful communicator, production-minded collaborator

Turning Industrial Data
into Operational Intelligence

I'm an engineer working at the intersection of data, AI, and product reliability — building and supporting predictive maintenance, quality analytics, and design-verification workflows for globally deployed precision instruments in a regulated industry.

At Hitachi High-Technologies, I contribute to fleet-scale AI predictive maintenance development with international R&D partners, drive root-cause analytics on operational telemetry, and act as the bridge between equipment domain knowledge and the data-science teams that turn it into models.

I'm passionate about bridging the OT / IT gap in regulated, mission-critical industries, and I thrive in environments where data engineering and hands-on domain expertise combine to drive real-world reliability and cost outcomes.

Predictive Maintenance Industrial IoT Time-Series Analysis Anomaly Detection Quality Analytics Cross-border Collaboration AI / ML

Domain Expertise

Hands-on understanding of operational data from globally deployed precision instruments — failure modes, sensor characteristics, and end-user maintenance workflows.

Full-Stack Data Engineering

From raw sensor ingestion (REST/Kafka) to data modeling, cleansing, transformation, and visualization — I own the full pipeline.

AI-Augmented Solutions

Building ML models and AI agents that translate data patterns into actionable maintenance decisions, not just dashboards.

Cloud & DevOps Mindset

CI/CD-first development with Docker, GitHub Actions, and Azure — building solutions that scale beyond the initial pilot.

Technical Skills

Stack & Tooling

Built around the industrial data & AI engineering lifecycle — from edge sensor to executive dashboard.

Languages

Python SQL Bash Java (basic)

Data & ML

pandas / NumPy scikit-learn XGBoost / LightGBM LSTM / PyTorch Apache Spark Kafka

Cloud & Infrastructure

Azure (ADF, Synapse, AKS) AWS (S3, Lambda) GCP (BigQuery) Docker Kubernetes

Data Visualization

Plotly Dash Grafana Power BI Matplotlib / Seaborn

Integration & APIs

REST APIs GraphQL OPC-UA / MQTT SQL extractors Custom extractors

DevOps & Quality

Git / GitHub GitHub Actions (CI/CD) pytest Logging / Monitoring Data modeling

Experience

Career Timeline

Building industrial intelligence through data engineering and predictive analytics.

April 2022 — Present
Application & Quality Engineer — Industrial AI / Data Analytics
Hitachi High-Technologies Corporation · Tokyo, Japan
  • Contributed to fleet-scale AI predictive maintenance development for globally deployed precision instruments, partnering with an international R&D collaborator on data-sharing governance, schema definition, and feature design — translating equipment domain knowledge into ML inputs and supporting product-grade deployment of the resulting models.
  • Owned operational telemetry analytics on a globally deployed product fleet, combining statistical analysis (FTA, fishbone, multivariate inspection of ~50 quality channels) with on-site root-cause investigation; drove an order-of-magnitude reduction in out-of-spec rate on a key process metric.
  • Co-developed an automation robotics initiative for clinical-workflow tasks: facilitated cross-functional design reviews, designed and analysed a quantitative product-evaluation survey, and was named co-inventor on a filed patent on human–robot collaboration.
  • Led design verification across multiple parallel themes in coordination with international R&D peers in Europe, running technical reviews in English; contributed data, calculation tooling, and regulatory documentation to a manufacturing-transfer programme that launched ahead of schedule.
  • Continuously deepening ML / AI skills outside core work — Kaggle competitions, completion of a graduate-school medical-AI programme, G-Test (JDLA Generalist) certification, and active engagement with academic conferences on medical AI and computer vision.

Featured Project

Predictive Maintenance on NASA CMAPSS

predictive-maintenance-cmapss

An end-to-end Python pipeline on NASA's CMAPSS turbofan degradation dataset — strict-schema data loader, feature engineering, baseline and gradient-boosted RUL regressors, with executed Jupyter notebooks showing every result.

96% test coverage
59 tests passing
Python 3.11–3.13 CI matrix
MIT licensed
Python scikit-learn XGBoost pandas Plotly Jupyter GitHub Actions uv ruff / mypy

Results from the executed notebooks

Real benchmarks on FD001

Every figure below is rendered straight from the executed notebook in the repository — click any card to open the full notebook on GitHub.

Sensor 11 aligned to failure across 25 training units

EDA — trajectories aligned to failure

25 training units, sensor 11 plotted vs. cycles before failure. Clean monotonic drift in the last ~80 cycles motivates the piecewise-linear RUL relabelling.

RUL distribution before and after piecewise-linear clipping at 125 cycles

RUL distribution — raw vs. clipped

Capping the regression target at 125 cycles concentrates model capacity on the regime where degradation is observable. Heimes (2008) convention.
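The piecewise-linear relabelling is a small transformation; here is a minimal sketch of the idea, with illustrative column names (`unit`, `cycle`) rather than the repository's actual schema:

```python
import pandas as pd

def add_clipped_rul(df: pd.DataFrame, cap: int = 125) -> pd.DataFrame:
    """Compute remaining useful life per unit and cap it (Heimes 2008 convention).

    Assumes a 'unit' (engine id) and 'cycle' (time step) column -- illustrative names.
    """
    out = df.copy()
    # RUL = cycles remaining until the unit's final observed cycle
    out["rul"] = out.groupby("unit")["cycle"].transform("max") - out["cycle"]
    # Early-life degradation is unobservable, so cap the regression target
    out["rul_clipped"] = out["rul"].clip(upper=cap)
    return out

# Tiny synthetic example: one unit that fails at cycle 200
demo = pd.DataFrame({"unit": [1] * 200, "cycle": range(1, 201)})
demo = add_clipped_rul(demo)
print(demo["rul"].max(), demo["rul_clipped"].max())  # 199 125
```

The cap turns the early, flat part of each trajectory into a constant label, so the model spends its capacity on the observable degradation regime.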

Ridge baseline predictions vs true RUL on FD001 test set

Ridge baseline — RMSE 18.27 / S-score 592.6

Standard-scaled L2 regression on rolling features. Sets the floor that any non-linear model must clearly beat to justify its complexity.
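The shape of that baseline — rolling statistics fed into a standard-scaled L2 regression — can be sketched on synthetic data (this stand-in series and window size are illustrative, not the repository's feature set):

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for one drifting sensor channel (not CMAPSS data)
n = 300
df = pd.DataFrame({
    "cycle": np.arange(n),
    "sensor": 0.01 * np.arange(n) + rng.normal(0, 0.2, n),
})
df["rul"] = (n - 1 - df["cycle"]).clip(upper=125)  # clipped target, as above

# Rolling statistics: the kind of features the linear baseline consumes
win = 20
df["sensor_mean"] = df["sensor"].rolling(win, min_periods=1).mean()
df["sensor_std"] = df["sensor"].rolling(win, min_periods=1).std().fillna(0.0)

X = df[["sensor", "sensor_mean", "sensor_std"]]
y = df["rul"]

model = make_pipeline(StandardScaler(), Ridge(alpha=1.0))
model.fit(X, y)
rmse = float(np.sqrt(((model.predict(X) - y) ** 2).mean()))
print(f"in-sample RMSE: {rmse:.1f}")
```

A baseline this simple is cheap to compute and easy to audit, which is exactly why it makes a credible floor for the non-linear models.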

Ridge baseline vs XGBoost predictions side by side

XGBoost head-to-head — RMSE 18.23 / S-score 814.8

Tuned XGBoost edges out Ridge on RMSE but loses on the asymmetric S-score. The honest result: FD001's single regime is exactly where linear features compete with trees.
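The RMSE-vs-S-score split comes from the asymmetry of the CMAPSS scoring function: a late prediction (overestimated RUL) is penalised exponentially harder than an early one. A minimal implementation:

```python
import numpy as np

def s_score(rul_true, rul_pred) -> float:
    """NASA CMAPSS score: d = predicted - true; late predictions (d > 0) cost more."""
    d = np.asarray(rul_pred, dtype=float) - np.asarray(rul_true, dtype=float)
    return float(np.sum(np.where(d < 0,
                                 np.exp(-d / 13.0) - 1.0,   # early: mild penalty
                                 np.exp(d / 10.0) - 1.0)))  # late: steep penalty

# Same absolute error of 10 cycles, very different penalty:
print(s_score([100], [90]))   # ~1.16 (10 cycles early)
print(s_score([100], [110]))  # ~1.72 (10 cycles late)
```

So a model can shave a little RMSE while accumulating a worse S-score if its errors skew toward late predictions — which is the pattern the head-to-head comparison shows.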

Top 15 XGBoost features by gain

Feature importance corroborates the EDA

Rolling statistics on high-pressure-compressor sensors dominate — exactly the channels whose drift was visible in the EDA notebook.

FD002 / FD004 multi-regime evaluation

Operating-regime clustering, regime-aware normalisation, and an LSTM sequence model over full trajectories — the setting where XGBoost is expected to clearly win.

Track progress on GitHub →

Interactive concept preview

Synthetic dashboard — live in your browser

A self-contained Plotly visualisation showing what the same pipeline would look like when run against real-time sensor streams. The data is synthetic so the page stays static — for the actual benchmark numbers, see the cards above.

Multi-Sensor Time Series — Degradation Monitoring

RUL Prediction — Actual vs. Predicted

Anomaly Detection — Operating State Classification

Equipment Health Score (Current)

Sensors Vibration, Temperature, Pressure
Cycles 300 operating cycles
Alert threshold Health score < 30%
Implementation eastani/predictive-maintenance-cmapss ↗

How I Work

From equipment signals to operational AI

A practical workflow for industrial data projects where model quality, domain validity, and deployment constraints all matter.

01

Frame the operating problem

Start from failure modes, maintenance workflows, and business impact before touching the model. The target is a decision, not a chart.

02

Build trustworthy data contracts

Define schemas, sensor semantics, validation checks, and reproducible datasets so experiments are explainable and repeatable.
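In its simplest form, a data contract is just an explicit schema plus checks that run before any experiment. A minimal sketch, with a hypothetical contract and column names chosen for illustration:

```python
import pandas as pd

# Hypothetical contract for a telemetry table: column -> (dtype, nullable)
CONTRACT = {
    "unit": ("int64", False),
    "cycle": ("int64", False),
    "sensor_11": ("float64", True),
}

def validate(df: pd.DataFrame, contract: dict = CONTRACT) -> list:
    """Return a list of contract violations (an empty list means the frame passes)."""
    errors = []
    for col, (dtype, nullable) in contract.items():
        if col not in df.columns:
            errors.append(f"missing column: {col}")
            continue
        if str(df[col].dtype) != dtype:
            errors.append(f"{col}: expected {dtype}, got {df[col].dtype}")
        if not nullable and df[col].isna().any():
            errors.append(f"{col}: nulls not allowed")
    return errors

good = pd.DataFrame({"unit": [1, 1], "cycle": [1, 2], "sensor_11": [47.2, None]})
print(validate(good))  # []
```

Libraries such as pandera or Great Expectations productionise this pattern, but even a check this small makes a dataset's assumptions visible to both data scientists and domain engineers.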

03

Validate against reality

Compare simple baselines, inspect feature behavior, and test whether the result matches equipment physics and field intuition.

04

Package for adoption

Turn notebooks into tested pipelines, dashboards, alerts, and documentation that service, quality, and R&D teams can actually use.

Architecture

Data Pipeline Design

End-to-end industrial data flow aligned with Cognite Data Fusion's integration model.

01 — Data Sources & Ingestion
Equipment
Sensors / PLC / DCS
📶
OPC-UA / MQTT
Edge Protocol
Custom Extractor
Python + Docker
📊
Kafka / REST
Event Streaming
02 — Processing & Contextualization
Data Lake
Azure ADLS Gen2
🔨
Transformation
Spark / Databricks
📋
Data Modeling
Graph + Relational
🌐
CDF / Data Fusion
Contextualized Assets
03 — AI / Intelligence Layer
🧠
ML Models
RUL / Anomaly Detection
🤖
AI Agent
GenAI + LLM
📌
Alert Engine
Threshold + Rule-based
📍
Dashboard
Plotly Dash / Grafana
04 — CI/CD & Operations
🔄
GitHub Actions
CI/CD Pipeline
📡
Container Registry
Azure ACR / Docker
👀
Monitoring
Azure Monitor / Grafana
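The alert-engine stage of this flow is deliberately simple — thresholds and rules over model outputs. A sketch of that idea, reusing the dashboard's "health score < 30%" threshold and a hypothetical RUL early-warning cutoff:

```python
from dataclasses import dataclass

@dataclass
class Alert:
    unit: int
    rule: str
    message: str

HEALTH_FLOOR = 30.0  # percent, matching the dashboard's alert threshold
RUL_WARN = 50        # cycles; hypothetical early-warning cutoff

def evaluate(unit: int, health: float, rul_pred: float) -> list:
    """Apply threshold rules; a real deployment adds hysteresis and deduplication."""
    alerts = []
    if health < HEALTH_FLOOR:
        alerts.append(Alert(unit, "health_floor",
                            f"health {health:.0f}% below {HEALTH_FLOOR:.0f}%"))
    if rul_pred < RUL_WARN:
        alerts.append(Alert(unit, "rul_warning",
                            f"predicted RUL {rul_pred:.0f} cycles"))
    return alerts

print(evaluate(unit=7, health=22.0, rul_pred=35.0))  # both rules fire
```

Keeping this layer rule-based means field teams can read, audit, and tune the conditions without retraining anything upstream.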

Contact

Target roles Data Engineer / Data Scientist / Industrial AI Engineer
Location Tokyo / remote-friendly global teams
Languages Japanese / English technical communication

Let's Build Something
Industrial & Intelligent

Open to Data Engineer / Data Scientist opportunities in industrial AI and IIoT platforms. Let's connect.