Publication

KAIST Introduction to Reinforcement Learning · Project · 2606

Korean 4-Ball Billiards: A Continuous, Deterministic, Sparse-Reward Benchmark Solved by Inference-Time Search

Doyeol Oh, Byungmo Kang, Seojun Park

We cast Korean 4-ball billiards (sagu) as a continuous, deterministic, sparse-reward RL benchmark with a fast, exact physics simulator. Off-policy RL (SAC/TD3) beats PPO by ~2.5x, but the bare task plateaus below one point per inning even with more compute, a curriculum, or a learned reward model. Explicit geometry — a first-contact aim constraint plus four carom features — lifts SAC from 0.487 to 6.460 points/inning. The final jump comes from inference-time search: greedy depth-2 lookahead using the simulator as its own verifier turns a ~6-point policy into chains of up to 9,392 consecutive scoring shots at 99.8% per-shot success.
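The depth-2 lookahead can be illustrated with a toy sketch. Everything here is a hypothetical stand-in: the three-position transition table replaces the physics simulator, and the names `simulate`, `lookahead_value`, `choose`, and `play_inning` are illustrative, not taken from the project. It assumes a miss ends the inning and that the simulator is deterministic, so a simulated outcome is exactly what the real shot would produce.

```python
# Toy deterministic "simulator": (position, shot) -> (next position, scored?).
# This table is a hypothetical stand-in for the real physics simulator.
TRANSITIONS = {
    ("start", "soft"): ("corner", True),    # scores, but leaves a dead position
    ("start", "firm"): ("center", True),    # scores and keeps a good position
    ("corner", "soft"): ("corner", False),
    ("corner", "firm"): ("corner", False),
    ("center", "soft"): ("center", True),
    ("center", "firm"): ("corner", True),
}
ACTIONS = ("soft", "firm")


def simulate(state, action):
    """Deterministic step, so it doubles as an exact verifier."""
    return TRANSITIONS[(state, action)]


def lookahead_value(state, action, depth):
    """Number of consecutive scoring shots reachable within `depth` shots."""
    next_state, scored = simulate(state, action)
    if not scored:  # assumption: a miss ends the inning
        return 0
    if depth == 1:
        return 1
    return 1 + max(lookahead_value(next_state, a, depth - 1) for a in ACTIONS)


def choose(state, depth):
    """Greedy choice: take the shot with the best lookahead value."""
    return max(ACTIONS, key=lambda a: lookahead_value(state, a, depth))


def play_inning(depth, max_shots=5):
    """Play until a miss or `max_shots`; return the scoring chain length."""
    state, chain = "start", 0
    while chain < max_shots:
        state, scored = simulate(state, choose(state, depth))
        if not scored:
            break
        chain += 1
    return chain


print(play_inning(depth=1))  # 1: the first scoring shot strands the cue ball
print(play_inning(depth=2))  # 5: lookahead avoids the dead position
```

In this toy, depth 1 takes the first shot that scores and gets stuck, while depth 2 prefers the shot that scores and still leaves a scoring follow-up.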

A random rack of the cue ball, opponent cue ball, and two red balls on the Korean 4-ball table simulator

KAIST Automated Software Testing · Project · 2606

ASTraMut: Learning Java Mutation Operators from Real-World Bug-Fix Patterns

Doyeol Oh, Hyunji Park, Junseo Jang, Dongjae Lee

A Java mutation operator learner that mines AST-level edit patterns from real bug-fix commits, generalizes them via anti-unification, and reverses the learned fixes into mutation operators. On Defects4J, the merged top-100 operator set achieves a mutation score 1.94% lower than PIT's default operators on the same relevant test suites, while covering bug-shape families (API renames, whole-predicate negation, block-shape rewrites) that PIT's fixed mutator catalogue cannot express.

Venn diagram of the overlap between ManySStuBs4J, Bugs2Fix, and PIT default mutation operators

KAIST Introduction to Artificial Intelligence · Assignment 3 · Pacman Competition Award · 2605

Capture-the-Flag Pacman with Self-Play Tuned Heuristics

Doyeol Oh

A two-vs-two CTF Pacman team built on classical search (goal-commit A* offense, alpha-beta minimax defense, and a 42-feature linear evaluator), plus a held-out verification protocol designed to defeat zoo-overfitting. The contribution is treating the student round-robin directly as an unseen distribution and generalizing via hand-inspectable weights and external anchors instead of deep RL.

KAIST Introduction to Artificial Intelligence · Assignment 2 · 2604

Multi-Agent Search for Pacman: Reflex, Minimax, and Alpha-Beta

Doyeol Oh

An analysis that disentangles three effects commonly conflated in adversarial Pacman: action ordering has two dimensions (pruning efficiency vs. tie-breaking), minimax is brittle against random ghosts because of a pessimism cascade rather than poor evaluation quality, and on trapped layouts the −1 living penalty creates a "swift-death preference" that makes deeper search rush into a ghost.

KAIST Introduction to Artificial Intelligence · Assignment 1 · 2603

Graph Search for Pacman: DFS, BFS, UCS, and A*

Doyeol Oh

DFS / BFS / UCS / A* on the CS188 framework, plus a custom admissible heuristic — Blockage Detection + Tarjan articulation-point Portal Detection + dead-end peeling — that expands 34.4% fewer nodes than Manhattan on average. Per-call preprocessing made wall-clock time worse for single queries, a clean illustration of the search-quality vs evaluator-cost tradeoff.

UNIST Machine Learning · Final Project Report · 2512

SKiP: SVM weighted by K-Nearest-Neighbors and class Probability for weakening outliers

Doyeol Oh, Jeonghoon Park, Jaemin Kim, KangJun Lee

A weighted soft-margin SVM with slack penalty C_i = C · (p_i + n_i) / 2, where p_i is a class-conditional Gaussian probability (catching feature outliers) and n_i is a KNN label-consistency score (catching label outliers). The novelty is the additive aggregation: a multiplicative form collapses when either signal breaks (e.g. the Gaussian assumption on Titanic), while the average lets the surviving signal carry the weight.
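A minimal sketch of the penalty C_i = C · (p_i + n_i) / 2 next to a multiplicative alternative, assuming p_i and n_i are already scaled to [0, 1]. The function names and toy values are hypothetical, and the product form C · p_i · n_i is only one plausible reading of "multiplicative"; how the report computes p_i and n_i is not shown here.

```python
def skip_penalties(C, p, n):
    """Additive aggregation: C_i = C * (p_i + n_i) / 2."""
    if len(p) != len(n):
        raise ValueError("p and n must have the same length")
    return [C * (pi + ni) / 2 for pi, ni in zip(p, n)]


def product_penalties(C, p, n):
    """Multiplicative alternative (an assumed form): C_i = C * p_i * n_i."""
    if len(p) != len(n):
        raise ValueError("p and n must have the same length")
    return [C * pi * ni for pi, ni in zip(p, n)]


C = 10.0
# Toy values: sample 1 has a broken Gaussian signal (p = 0) but fully
# consistent neighbor labels (n = 1).
p = [0.9, 0.0, 0.8]
n = [1.0, 1.0, 0.2]

additive = skip_penalties(C, p, n)
multiplicative = product_penalties(C, p, n)

print(additive[1])        # 5.0: the KNN signal still carries half the weight
print(multiplicative[1])  # 0.0: one failed signal erases the sample's penalty
```

If scikit-learn were the solver, one way to apply such weights would be `SVC(C=C).fit(X, y, sample_weight=w)` with w_i = (p_i + n_i) / 2, since `sample_weight` rescales C per sample. Whether the report does it this way is not stated.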

Korean Database Conference (KDBC) 2025 · 2511

Entropy-Guided Adaptive Label Propagation for Location-Aware Graph Clustering

Doyeol Oh, Hyewon Kim, Dahee Kim, Junghoon Kim

An adaptive label propagation method for location-based social networks (LBSN) in which each node's structural-vs-spatial weight is α = 1 − H/log|L|, derived from the entropy of its neighbor labels. When neighbors agree, the Jaccard structural term dominates; when they disagree, the Haversine spatial term takes over, visually separating structurally connected but geographically distant cities (e.g. Nashville vs. Atlanta).
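A small sketch of the weight α = 1 − H/log|L|, assuming H is the Shannon entropy of a node's neighbor labels and |L| is the size of the label set passed in as `num_labels`. The convex combination in `combined_score` is an assumption about how the two terms are mixed, and both function names are hypothetical.

```python
import math
from collections import Counter


def structural_weight(neighbor_labels, num_labels):
    """alpha = 1 - H / log|L|, with H the entropy of the neighbor labels."""
    if not neighbor_labels or num_labels < 2:
        # Assumption: with no neighbors or a single possible label there is
        # no disagreement to measure, so the structural term gets full weight.
        return 1.0
    total = len(neighbor_labels)
    counts = Counter(neighbor_labels)
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    return 1.0 - entropy / math.log(num_labels)


def combined_score(alpha, structural, spatial):
    """Assumed mixing rule: alpha weights the structural (Jaccard) term,
    and 1 - alpha weights the spatial (Haversine-based) term."""
    return alpha * structural + (1.0 - alpha) * spatial


agree = structural_weight(["a", "a", "a", "a"], num_labels=4)     # close to 1
split = structural_weight(["a", "a", "b", "b"], num_labels=4)     # close to 0.5
disagree = structural_weight(["a", "b", "c", "d"], num_labels=4)  # close to 0
```

With unanimous neighbors the entropy is zero and the structural term dominates; with maximally mixed neighbors the entropy reaches log|L| and the spatial term takes over.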

UNIST Introduction to Algorithms · Best Paper Award · 2506

Hylos: Hierarchically Localized Optimization Strategy for TSP

Doyeol Oh

A four-stage hierarchical TSP solver: k-means partitions cities into clusters of size ≤ 22 so Held-Karp becomes feasible, then both inter-cluster and intra-cluster tours dispatch by size between Held-Karp and Christofides, with a final entry/exit alignment to minimize cluster-boundary transitions. On mona-lisa100k it is ~8× faster than Christofides at ~2% lower cost. Won the UNIST CSE331 Best Paper Award.

UNIST Introduction to Algorithms · Assignment 1 · 2504

Comparative Study of Twelve Sorting Algorithms

Doyeol Oh

A C++ benchmark of twelve sorts across random / sorted / reverse / partial inputs from 10³ to 10⁶. Two findings worth keeping: vanilla Lomuto quicksort crashes on sorted input because of unbalanced recursion (median-of-three pivoting is practically required), and a naive multithreaded Timsort variant ran slower than single-threaded Timsort because thread-creation overhead dominated the merge gain.
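The unbalanced-recursion finding can be reproduced in a short sketch. It is written in Python, not the benchmark's C++, and the function names are hypothetical; `quicksort` returns the maximum recursion depth so the effect of the pivot rule is visible without actually crashing.

```python
def median_of_three(a, lo, hi):
    """Order a[lo], a[mid], a[hi], then park the median at a[hi] as pivot."""
    mid = (lo + hi) // 2
    if a[mid] < a[lo]:
        a[lo], a[mid] = a[mid], a[lo]
    if a[hi] < a[lo]:
        a[lo], a[hi] = a[hi], a[lo]
    if a[hi] < a[mid]:
        a[mid], a[hi] = a[hi], a[mid]
    a[mid], a[hi] = a[hi], a[mid]


def lomuto_partition(a, lo, hi):
    """Lomuto partition around a[hi]; return the pivot's final index."""
    pivot = a[hi]
    i = lo
    for j in range(lo, hi):
        if a[j] < pivot:
            a[i], a[j] = a[j], a[i]
            i += 1
    a[i], a[hi] = a[hi], a[i]
    return i


def quicksort(a, lo=0, hi=None, use_median=True):
    """Sort a[lo..hi] in place and return the maximum recursion depth."""
    if hi is None:
        hi = len(a) - 1
    if lo >= hi:
        return 0
    if use_median:
        median_of_three(a, lo, hi)
    p = lomuto_partition(a, lo, hi)
    left = quicksort(a, lo, p - 1, use_median)
    right = quicksort(a, p + 1, hi, use_median)
    return 1 + max(left, right)


n = 300
depth_vanilla = quicksort(list(range(n)), use_median=False)  # grows like n
depth_median = quicksort(list(range(n)), use_median=True)    # grows like log n
```

On already sorted input the last-element pivot peels off one element per call, so the depth grows linearly with n, while the median-of-three pivot splits each range roughly in half.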

ICROS (Institute of Control, Robotics, and Systems) 2024 · 2407

Development of Intuitive Steering Mechanism for Hands-Free Operation of Indoor Shared Mobility

Donghoon Nam, Doyeol Oh, Seongjae Lee, Yunjeong Gwak, Huisung Lee

A chair-shaped indoor mobility device with hands-free steering: a potentiometer reads saddle rotation and an STM32F303RE drives a PID-controlled steering motor, and the throttle is replaced by a kick-to-start scheme. Both hands and feet stay free while moving, and the form factor lets you sit and rest the moment you stop.

Patent

2602

System and Method for Spatially Proximate Community Detection Based on Entropy-Weighted Adaptive Label Propagation

KR 10-2026-0027653 · Filed 2026-02-11
