Backend / Systems / Applied AI Engineer

Shufeng Chen

M.S. in Electrical Engineering at Columbia University. My work spans cloud-native services, distributed systems, and AI product engineering, with an emphasis on reliability, observability, and clean API design.


Shufeng (Alex) Chen

Columbia University · M.S. EE

Go · Python · K8s

Open to opportunities

What I Build

Focus Areas

Cloud Backend

Microservices, REST APIs, caching, data pipelines. Building scalable cloud-native services on Kubernetes with emphasis on reliability and observability.

Kubernetes · Docker · Redis

Applied AI

LLM applications, retrieval-augmented generation, ML experimentation and evaluation. Turning research into production-grade AI products.

Python · OpenAI

Systems

Performance debugging, reliability engineering, and dev tooling. Distributed consensus, concurrency control, and low-level optimization.

Career Journey

Experience

Jul 2025 — Aug 2025

Software Engineer Intern

Tencent · Cloud Computing Group

Built backend microservices for distributed inference pipelines on Kubernetes-managed HPC clusters; automated container lifecycle & remote job operations for smoother orchestration.
Developed internal cluster management and scheduling APIs (Spring Boot/Security, Redis/MySQL) and integrated cloud metadata synchronization for lifecycle tracking.
Improved ingestion/telemetry services by shipping a CLI-based pipeline adopted by 20+ internal apps; enabled high-throughput streaming from 4,300+ edge nodes.
Deployed pre-embedding pipelines across Tencent Cloud and AWS (EKS/EC2), improving inference latency by 1.3× and reducing CPU load by 2.2× via caching and preprocessing.

Kubernetes · Spring Boot · Redis · MySQL · AWS EKS · Docker

Sep 2024 — Nov 2024

Full Stack Software Engineer Intern

Tree-Graph Research Institute

Built a cross-platform crypto analytics app (React Native) with GPT-based insights; summarized 5,000+ news sources using LDA topic modeling.
Designed an automated ETL pipeline (Scrapy, AWS Lambda, DynamoDB, EventBridge) and indexed real-time data to OpenSearch; improved search API performance by 33%.
Shipped high-performance secrets-detection microservices (Go/Python/Rust) over gRPC on Kubernetes; reduced P95 latency from 170 ms to 40 ms and supported 10K+ QPS using a Bloom-filter warm cache.
Implemented serverless remediation workflows (Step Functions + Lambda) and production observability (Prometheus/Grafana, structured logs).

Go · Python · Rust · gRPC · Kubernetes · AWS
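To illustrate the Bloom-filter warm cache mentioned above, here is a minimal Python sketch. It is not the production service: the class, the `warm_filter` helper, the filter sizes, and the sample fingerprints are all hypothetical. The idea it shows is that a filter pre-loaded with known fingerprints can rule out most candidates cheaply, so only possible matches pay for a full lookup.

```python
import hashlib


class BloomFilter:
    """Fixed-size Bloom filter: no false negatives, rare false positives."""

    def __init__(self, num_bits: int = 1 << 16, num_hashes: int = 4) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: str):
        # Double hashing: derive all bit positions from one SHA-256 digest.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        # Force an odd stride so it is coprime with a power-of-two size.
        h2 = int.from_bytes(digest[8:16], "big") | 1
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: str) -> bool:
        # False means "definitely absent"; True means "possibly present".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


def warm_filter(known_fingerprints):
    """Hypothetical warm-up step: pre-load the filter at service start."""
    bloom = BloomFilter()
    for fingerprint in known_fingerprints:
        bloom.add(fingerprint)
    return bloom
```

A caller would check `might_contain` first and fall through to the slower exact lookup only when it returns True.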

Jan 2023 — Mar 2023

Full Stack Software Engineer Intern

VisionX LLC

Developed an ONNX-optimized, real-time environmental monitoring platform with depth-enabled cameras and on-device inference for fire/smoke and pedestrian detection.
Led delivery of a PDF Q&A system on AWS (SageMaker/App Runner/S3) with LangChain + React + Express; improved responsiveness via vector caching and prompt tuning.
Designed an MLOps rollout pipeline (offline → shadow → live) with Jenkins and Grafana for drift/latency monitoring in field deployments.

Python · React · AWS · LangChain · ONNX · Jenkins

Featured Work

Projects

Selected repos from github.com/shufengc

QuantHarbor

Python · FastAPI · React · RAG

GitHub

End-to-end AI financial research platform transforming unstructured market documents into citation-grounded insights.

Multi-Agent Architecture: Orchestrated specialized agents (Data Collector, Analyzer, Report Generator, Deep Search) collaborating in a shared variable space.
RAG Pipeline: Built a document-to-insight pipeline integrating PDF ingestion, vector indexing, and citation tracing for verifiable insights.
VLM Feedback Loops: Implemented built-in vision agents that automatically correct chart issues during publication-grade report generation.
Interactive UI: Delivered a full-stack dashboard for system configuration, execution monitoring via WebSockets, and checkpoint/resume management.

Distributed Key-Value Store

Go · gRPC · Multi-Raft · MVCC

GitHub

Horizontally scalable, fault-tolerant key-value storage system featuring strong consistency and distributed transactions.

Raft Consensus & Sharding: Implemented Multi-Raft for log replication, dynamic region splitting, and leader transfer without downtime.
Global Scheduler: Developed a heartbeat-driven scheduler to monitor cluster metadata and auto-rebalance replicas across nodes.
Percolator 2PC: Built distributed transaction support with snapshot isolation, handling prewrites, commits, and rollbacks.
Concurrency Control: Managed concurrent operations via MVCC and per-key latching on an embedded LSM-tree engine.
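The MVCC idea behind the concurrency-control and snapshot-isolation bullets can be sketched in a few lines. This is an in-memory illustration only, written in Python rather than the project's Go, with no latching, persistence, or Percolator locks; `MVCCStore` and its method names are hypothetical. Each key keeps a timestamp-ordered version list, and a read at `read_ts` sees the newest version committed at or before that timestamp.

```python
import bisect


class MVCCStore:
    """Toy multi-version store: one sorted version list per key."""

    def __init__(self) -> None:
        # key -> (sorted commit timestamps, values aligned by index)
        self._versions = {}

    def put(self, key, value, commit_ts: int) -> None:
        timestamps, values = self._versions.setdefault(key, ([], []))
        # Keep versions ordered by commit timestamp.
        idx = bisect.bisect_right(timestamps, commit_ts)
        timestamps.insert(idx, commit_ts)
        values.insert(idx, value)

    def get(self, key, read_ts: int):
        """Return the newest value with commit_ts <= read_ts, else None."""
        entry = self._versions.get(key)
        if entry is None:
            return None
        timestamps, values = entry
        idx = bisect.bisect_right(timestamps, read_ts)
        if idx == 0:
            return None  # nothing was committed yet at this snapshot
        return values[idx - 1]
```

Because writers only append new versions, a reader holding an older `read_ts` keeps seeing a consistent snapshot while later commits land.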

Patton Food

Java · Spring Boot · React · PostgreSQL

GitHub

Full-stack, cloud-deployed online food ordering web application with clean API design and a modern UI.

User registration and login with session-based authentication using Spring Security.
Password hashing with BCrypt and secure session management.
Restaurant/menu browsing with cart and checkout flow.
Cloud deployment on AWS with production-ready configuration.

PDF-AI: Conversational Q&A System

TypeScript · React · LangChain · OpenAI

GitHub

AI assistant that lets users upload PDFs and ask natural language questions about their content.

PDF upload & processing pipeline with chunking and vector indexing.
Retrieval + LLM answering with conversation context and source citations.
Responsive chat UI with streaming responses.
Latency optimizations via caching and request shaping.
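As a rough illustration of the chunking step in the processing pipeline, the sketch below splits extracted text into overlapping windows before indexing. It is an assumption-laden example, not the project's code: the function name, the character-based splitting, and the default sizes are hypothetical, and a real pipeline might split on tokens or sentences instead.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list:
    """Split text into fixed-size character chunks that share `overlap` chars."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once this chunk has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks
```

The overlap keeps a sentence that straddles a boundary retrievable from at least one chunk, at the cost of indexing some text twice.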

L2 Game Engine

C++ · Lua · SDL · Box2D

GitHub

Cross-platform 2D game engine in C++ with Lua scripting — designed to keep the runtime fast, modular, and easy to extend.

C++ core runtime: real-time game loop, scene/actor system, and engine APIs optimized for iteration speed.
Lua scripting + language hosting: externalized gameplay logic with script-facing APIs for actor creation and behaviors.
Box2D physics: collision handling, rigid body dynamics, friction, and gravity for realistic 2D gameplay.
Shipped a complete game on the engine: Havoc: 300-Seconds Escape.

LinguAR: AR Language Learning

Unity · C# · ARKit · GPT-4o

Demo

iOS AR app prototype combining real-scene recognition, translation, and multimodal tutoring for daily language practice.

Built the end-to-end AR learning loop: object recognition (YOLOX) → translation overlay (Google Cloud Translate) → pronunciation output (AWS Polly).
Led implementation of the AR shooting mini-game: fixed screen flicker and added SFX/VFX and a scoreboard.
Integrated voice-first AI tutoring (Whisper STT + GPT-4o + TTS) for conversational practice with multimodal I/O.
Delivered a usability-tested team prototype with story maps and iterative refinements.

Interactive Work