Address Element Extraction - Shopee Code League 2021
🎯 Problem Statement

Unstructured, incomplete, and often misspelled Indonesian addresses make accurate geocoding for last-mile delivery a major challenge. In the Shopee Code League 2021 Data Science round, we were given 300,000 training samples and 50,000 test addresses.

The task:

- Extract the Point of Interest (POI) and Street from raw address text
- Enable downstream geocoding to optimize delivery routes and improve customer experience

| Raw address | POI | Street |
| --- | --- | --- |
| cipinang besar selatan lintas ibadah, cipi jaya 1a no 3 rw 7 13410 jatinegara | lintas ibadah | cipinang jaya 1a |
| puri kemb timur | None | puri kembang timur |

🔍 NER Training Pipeline

We formulated POI/Street extraction as a token-level Named Entity Recognition (NER) problem: ...
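
To make the token-level framing concrete, here is a minimal sketch of how a raw address and its POI/Street labels could be turned into per-token BIO tags. Everything in it is an assumption for illustration rather than the pipeline used in this project: the tag names (`B-POI`, `I-STREET`, and so on), the regex tokenizer, and the helper names `tokenize`, `find_span`, and `bio_tags` are all hypothetical. Because labels sometimes spell out words that are abbreviated in the raw text (for example "kemb" versus "kembang"), the sketch aligns a raw token to a label token when the raw token is a prefix of it.

```python
import re
from typing import List, Optional, Tuple


def tokenize(text: str) -> List[str]:
    # Hypothetical tokenizer: lowercase alphanumeric runs, punctuation dropped.
    return re.findall(r"[a-z0-9]+", text.lower())


def find_span(raw_tokens: List[str], label_tokens: List[str]) -> Optional[Tuple[int, int]]:
    # Return (start, end) of the first window of raw tokens in which every raw
    # token is a prefix of the corresponding label token, or None if no window fits.
    n = len(label_tokens)
    if n == 0:
        return None
    for i in range(len(raw_tokens) - n + 1):
        window = raw_tokens[i:i + n]
        if all(lab.startswith(raw) for raw, lab in zip(window, label_tokens)):
            return i, i + n
    return None


def bio_tags(raw: str, poi: Optional[str], street: Optional[str]) -> Tuple[List[str], List[str]]:
    # Assign one BIO tag per raw token; tokens outside any entity get "O".
    # If the POI and Street spans overlapped, the Street tags would win here.
    tokens = tokenize(raw)
    tags = ["O"] * len(tokens)
    for label, name in ((poi, "POI"), (street, "STREET")):
        if not label:
            continue  # a missing entity (None or empty) contributes no tags
        span = find_span(tokens, tokenize(label))
        if span is None:
            continue  # label could not be aligned with this simple heuristic
        start, end = span
        tags[start] = f"B-{name}"
        for i in range(start + 1, end):
            tags[i] = f"I-{name}"
    return tokens, tags


tokens, tags = bio_tags("puri kemb timur", None, "puri kembang timur")
print(list(zip(tokens, tags)))
# [('puri', 'B-STREET'), ('kemb', 'I-STREET'), ('timur', 'I-STREET')]
```

Applied to the first example address above, this heuristic tags "lintas ibadah" as the POI and "cipi jaya 1a" as the Street span. A prefix match is only one possible alignment rule; abbreviations that are not prefixes of the full word would need a different strategy.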