mep · PIPELINE

The eval-and-policy-composition pipeline

Every stage from an IFC building model to a surface-coverage recall@K number: pose sampling, raycast capture, the K-step rollout, the joint learned scorer, the lookahead wrapper that actually ships, and the metric. Click any node for what it is, the real class/function behind it, and the committed file path.

The two amber FIX nodes are the methodology contribution: scanner-z anchoring plus a raycast floor filter in pose sampling, and the scene-full GT denominator in the recall metric. The second corrects a metric pathology that inflated the oracle's score 5-50×. Fixing it dropped the oracle on gni_model_173 from 1.0 to 0.0093.
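To make the denominator effect concrete, here is a minimal sketch with toy numbers. It assumes the pre-fix denominator counted only the instances visible in the partial cloud; the page does not spell out the old denominator, so treat that as an assumption. The `recall` helper and all instance IDs are hypothetical and are not the committed `compute_mep_recall`.

```python
from typing import Set


def recall(matched: Set[int], denominator: Set[int]) -> float:
    """Fraction of the denominator instances that were matched."""
    if not denominator:
        return 0.0
    return len(matched & denominator) / len(denominator)


# Hypothetical toy scene: 200 ground-truth instances, of which the scans
# touched only 2, and both of those were matched.
scene_instances = set(range(200))
observed_instances = {3, 17}
matched_instances = {3, 17}

# Assumed pre-fix behaviour: denominator = instances seen so far.
inflated = recall(matched_instances, observed_instances)

# Scene-full GT denominator: every instance in the scene counts.
corrected = recall(matched_instances, scene_instances)

print(inflated, corrected)
```

With an observed-only denominator, a policy that sees almost nothing can still score 1.0. The scene-full denominator removes that loophole.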

DATA · IFC scene

SIMULATOR · Open3D raycast

ROLLOUT · K steps

MODEL · learned scorer

EVAL · per-step metric

OUTPUT · curves + headline

🏛

IFC scene file

manifest entry
typology + path

📋

manifest.json

scene_ids · splits
iou_threshold

📦

learned_ckpt.pt

JointInfoGain weights
+ sibling config.json

⚙

eval config

K=rollout steps
n_candidates · seed

📐

Open3DSimulator.load_scene

IFC → Scene
AABB · mesh · instance_class

🏗

Scene

aabb · mesh · graph
instance_class[i] = MEPClass

FIX

🎯

sample_feasible_poses

fixed scanner-z (MEP-anchored)
+ raycast floor filter

candidates ×N

📡

scan_from_pose

5-DoF raycast bundle
→ LabeledCloud (xyz,cls,inst)

🧩

partial_cloud + history

union of scans so far
+ chosen pose idx list

🧵

BaselineContext

scene · sim · extractor
score_fn

🎲

policy.pick

→ next pose idx

🔁

scan + union

partial' = partial ∪ scan
history.append(idx)

☁

final partial_cloud

after K steps
xyz · class_id · instance_id

🧮

GeometricExtractor

cloud → predicted
(pred_class, pred_instance)

📍

predicted instances

🧠

JointInfoGainPredictor

(partial_cloud + history,
candidates) → scores

FIX

📊

compute_mep_recall

Hungarian-matched, IoU≥τ
denom = scene.instance_class

RecallResult.overall

🧊

surface_coverage

vs dense-GT voxels

🔗

edge_recall

graph endpoints
≥min_pts both sides

📈

recall@K curves

per-policy · per-scene

✓

headline

recall@K
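The ROLLOUT nodes above (policy.pick, scan + union, history.append) can be read as one loop. The sketch below is a toy version under stated assumptions: points are plain tuples rather than a LabeledCloud, and `rollout`, `greedy_pick` and `toy_scans` are hypothetical names invented for illustration, not the committed classes.

```python
from typing import Callable, List, Sequence, Set, Tuple

Point = Tuple[int, int, int]  # hypothetical stand-in for a labeled point


def rollout(
    candidates: Sequence[int],
    scan: Callable[[int], Set[Point]],
    pick: Callable[[Set[Point], List[int], Sequence[int]], int],
    k_steps: int,
) -> Tuple[Set[Point], List[int]]:
    """Run k_steps of pick -> scan -> union, returning the cloud and history."""
    partial: Set[Point] = set()
    history: List[int] = []
    for _ in range(k_steps):
        idx = pick(partial, history, candidates)  # policy.pick -> next pose idx
        partial = partial | scan(idx)             # partial' = partial ∪ scan
        history.append(idx)                       # history.append(idx)
    return partial, history


# Toy scans keyed by candidate pose index (made-up data).
toy_scans = {
    0: {(0, 0, 0), (1, 0, 0)},
    1: {(1, 0, 0), (2, 0, 0), (3, 0, 0)},
    2: {(0, 0, 0)},
}


def greedy_pick(partial, history, candidates):
    # Toy policy: the unvisited candidate that adds the most new points.
    # Skipping visited poses is a choice made for this sketch only.
    remaining = [c for c in candidates if c not in history]
    return max(remaining, key=lambda c: len(toy_scans[c] - partial))


final_cloud, chosen = rollout([0, 1, 2], toy_scans.__getitem__, greedy_pick, k_steps=2)
print(chosen, len(final_cloud))
```

The toy policy peeks at the true scans, so it behaves like an oracle. A learned policy would replace `greedy_pick` with a scorer that sees only the partial cloud, the history and the candidates.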

Legend: data / training flow | derived signal / eval | ● win · headline | FIX · methodology contribution

Eval protocol: surface-coverage recall@K=1 · 5 seeds · n_cand=20 · hybrid_top_k=12 · mean ± SE across seeds. Hybrid = LearnedPlusLookaheadBaseline: the model scores M candidates, keeps the top 12 (hybrid_top_k, not to be confused with the rollout K), and runs a 2-step lookahead on that shortlist. Classical baseline: OctoMap-IG (Bircher et al., ICRA 2016). The two FIX nodes (scanner-z anchoring + raycast floor filter, and the scene-full GT denominator) are the methodology contribution.
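The hybrid selection described in the protocol line can be sketched as a shortlist followed by a rerank. This is an illustration under assumptions, not the committed LearnedPlusLookaheadBaseline: `hybrid_pick` and `lookahead_value` are hypothetical names, and the lookahead is reduced to a callable that returns one number per candidate.

```python
from typing import Callable, Sequence


def hybrid_pick(
    scores: Sequence[float],
    lookahead_value: Callable[[int], float],
    top_k: int = 12,
) -> int:
    """Shortlist the top_k candidates by learned score, then rerank by lookahead."""
    # Candidate indices sorted by learned score, best first.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    shortlist = order[:top_k]
    # Only the shortlist pays for the (more expensive) lookahead.
    return max(shortlist, key=lookahead_value)


# Toy example: 4 candidates and a shortlist of 2 (made-up numbers).
toy_scores = [0.1, 0.9, 0.8, 0.2]
toy_lookahead = {0: 5.0, 1: 1.0, 2: 3.0, 3: 9.0}

choice = hybrid_pick(toy_scores, toy_lookahead.__getitem__, top_k=2)
print(choice)
```

In the toy data, candidate 3 has the best lookahead value but never reaches the shortlist. That is the trade-off of this design: the learned scorer bounds how many candidates the lookahead has to evaluate.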