15–27 September 2025 | Holiday Inn Resort Baruna Bali, Indonesia


Background

The era of Big Data has brought about significant transformations in the way governments, businesses, and societies make decisions. In the field of official statistics, the availability of new, large-scale, and real-time data sources provides unprecedented opportunities to enhance the quality, timeliness, and relevance of statistical products. Among the most promising sources is Mobile Positioning Data (MPD), generated by mobile network operators whenever subscribers interact with cellular networks.

MPD is particularly powerful because of its scale, frequency, and spatial detail. It can be used to generate insights on population mobility, tourism flows, commuting patterns, and urban dynamics, which are traditionally costly and time-consuming to measure through surveys. In Indonesia, MPD has already been successfully applied to produce tourism statistics, providing policymakers with evidence-based insights to support sustainable economic development.

Recognizing this potential, the Regional Hub on Big Data and Data Science for Asia and the Pacific, hosted by Statistics Indonesia (BPS) and Politeknik Statistika STIS, has initiated the Knowledge Development Series on MPD for Official Statistics. This series has two complementary dimensions:

  • Policy-level capacity building through the Executive Training for Policy Makers held in Jakarta in July 2025, where senior officials from across the region discussed strategic directions, governance frameworks, and international case studies.
  • Technical-level training, which is the focus of this short course in Bali, dedicated to equipping technical staff with the skills and tools required to transform raw MPD into meaningful and policy-relevant statistics.

By combining strategic awareness at the leadership level with practical competencies among technical staff, the program aims to create a holistic ecosystem for MPD adoption in the region.

Objectives

This two-week short course has three overarching objectives, each aligned with the broader agenda of statistical modernization:

  1. Strengthen technical capacity. Participants receive intensive training in modern data science tools, including SQL, Python, and PySpark. The sessions are designed to build competencies in handling large MPD datasets, conducting data cleaning and transformation, and applying analytical techniques such as clustering, stop-spot analysis, and event detection.
  2. Deepen understanding of MPD applications. Beyond technical exercises, the course provides modules on how MPD can be integrated into official statistics, particularly in tourism, which is a priority sector for many partner countries. Participants explore practical applications such as home location detection, domestic and inbound tourism indicators, and ICT-related statistics.
  3. Promote regional cooperation. By bringing together national statistical offices, tourism authorities, and mobile network operators from multiple countries, the course fosters dialogue and knowledge-sharing. This regional approach ensures that the development of MPD methodologies is not isolated, but instead benefits from shared experiences and collaborative problem-solving.

These objectives together lay the foundation for sustained progress, both at the national and regional levels.
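
To make the second objective more concrete, the following is a minimal sketch of one common heuristic for home location detection: assign each subscriber the cell where they are most often observed at night. It is not the method taught in the course. The record layout, the sample events, the 21:00 to 05:59 night window, and the function names are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from datetime import datetime

# Hypothetical, made-up records: (subscriber_id, cell_id, ISO timestamp).
EVENTS = [
    ("A", "cell_1", "2025-09-15T23:10:00"),
    ("A", "cell_1", "2025-09-16T02:30:00"),
    ("A", "cell_2", "2025-09-16T13:00:00"),
    ("A", "cell_2", "2025-09-16T14:00:00"),
    ("A", "cell_2", "2025-09-16T15:00:00"),
    ("B", "cell_3", "2025-09-15T22:05:00"),
    ("B", "cell_4", "2025-09-16T09:00:00"),
]

# Assumed night window: 21:00 up to, but not including, 06:00.
NIGHT_START_HOUR = 21
NIGHT_END_HOUR = 6


def is_night(timestamp: str) -> bool:
    """Return True if the timestamp falls inside the assumed night window."""
    hour = datetime.fromisoformat(timestamp).hour
    return hour >= NIGHT_START_HOUR or hour < NIGHT_END_HOUR


def detect_home_cells(events):
    """Map each subscriber to the cell with the most night-time events."""
    night_counts = defaultdict(Counter)
    for subscriber, cell, timestamp in events:
        if is_night(timestamp):
            night_counts[subscriber][cell] += 1
    # Subscribers with no night-time events are left out of the result.
    return {
        subscriber: counts.most_common(1)[0][0]
        for subscriber, counts in night_counts.items()
    }


home_cells = detect_home_cells(EVENTS)
print(home_cells)
```

In this toy data, subscriber A has more daytime events at cell_2, but only night-time events count toward the home cell. A production pipeline would also need to handle ties, sparse subscribers, and a longer observation period.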

Participants

The short course convened 14 technical staff from across Southeast Asia, representing three categories of institutions:

  • National Statistical Offices (NSOs) from the Philippines, Cambodia, and Timor-Leste.
  • Tourism authorities from the same three countries, ensuring that the outputs of MPD analysis are directly connected to policy needs in the tourism sector.
  • Mobile network operators in Timor-Leste, highlighting the importance of public–private collaboration in accessing and using MPD.

Participants were selected for their technical backgrounds in ICT, statistics, and data processing, so that the knowledge acquired during the course would be directly relevant and readily applicable in their institutions.

The diversity of participants also reflects the multi-stakeholder nature of MPD adoption: NSOs provide statistical expertise, tourism agencies articulate sectoral needs, and mobile network operators (MNOs) supply the data infrastructure.

Programme

The training programme ran for 13 effective days, from 15 to 27 September 2025, structured into several progressive modules:

Week 1: Data Science Foundations

  • Databases for Big Data Analytics: covering relational database systems, SQL fundamentals (DDL and DML), and advanced topics such as NoSQL and MongoDB.
  • Python for Big Data Analytics: introducing Python programming (via Google Colab), data cleaning and transformation, exploratory data analysis (EDA), and visualization techniques.
  • PySpark for Scalable Analytics: introducing distributed computing, PySpark DataFrame operations, window functions, and SQL integration.
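
As a small illustration of the DDL and DML distinction covered in the database module, the sketch below uses Python's built-in sqlite3 module with an in-memory database. The table name, columns, and rows are hypothetical and are not taken from the course materials.

```python
import sqlite3

# In-memory database so the example leaves nothing on disk.
conn = sqlite3.connect(":memory:")

# DDL (Data Definition Language): define the structure of a table.
conn.execute(
    """
    CREATE TABLE visits (
        visitor_id TEXT NOT NULL,
        region     TEXT NOT NULL,
        nights     INTEGER NOT NULL
    )
    """
)

# DML (Data Manipulation Language): insert, update, and query rows.
conn.executemany(
    "INSERT INTO visits (visitor_id, region, nights) VALUES (?, ?, ?)",
    [("v1", "Bali", 2), ("v2", "Bali", 1), ("v3", "Jakarta", 4)],
)
conn.execute("UPDATE visits SET nights = ? WHERE visitor_id = ?", (3, "v2"))

# Aggregate visitors and total nights per region.
rows = conn.execute(
    """
    SELECT region, COUNT(*) AS visitors, SUM(nights) AS total_nights
    FROM visits
    GROUP BY region
    ORDER BY region
    """
).fetchall()
conn.close()

for region, visitors, total_nights in rows:
    print(region, visitors, total_nights)
```

The same GROUP BY pattern carries over to PySpark SQL when the data no longer fits on a single machine.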

Week 2: MPD-Specific Modules and Applications

  • Tourism Statistics and MPD: concepts and definitions of domestic and inbound tourism, introduction to MPD access procedures, ICT-related indicators, and methodologies for stop-spot and event analysis.
  • Survey Design and Weighting: to link MPD-derived indicators with traditional survey frameworks.
  • Country Group Work: participants worked in teams to prepare country-specific feasibility assessments on the potential integration of MPD into their statistical systems.

SHORT COURSE AGENDA
15 September - 27 September 2025
Date Time Topic Materials
Monday, 15 September 2025 08.00 – 09.00 Registration
09.00 – 09.15 Event Report
09.15 – 09.30 Welcome Speech
09.30 – 09.45 Opening Speech
09.45 – 09.50 Group Photo
09.50 – 10.10 Coffee Break
10.10 – 12.10 Databases for Big Data Analytics
- Getting started with Database
- Install RDBMS
Slides
12.10 – 13.30 Lunch Break
13.30 – 15.30 Databases for Big Data Analytics
- Querying Data with SQL (DDL)
- DDL Exercise
Slides
15.30 – 15.45 Coffee Break
15.45 – 16.45 Databases for Big Data Analytics
- Querying Data with SQL (DML) - Part 1
- DDL Exercise
Slides
Tuesday, 16 September 2025 09.00 – 10.00 Databases for Big Data Analytics
- Querying Data with SQL (DML) - Part 2
- DML Exercise
Slides
10.00 – 10.15 Coffee Break
10.15 – 12.15 Databases for Big Data Analytics
and Introduction to NoSQL
Slides
12.15 – 13.30 Lunch Break
13.30 – 15.30 Databases for Big Data Analytics
MongoDB and NoSQL Exercise
Slides
15.30 – 15.45 Coffee Break
15.45 – 16.45 Databases for Big Data Analytics
MongoDB and NoSQL Exercise
Slides
Wednesday, 17 September 2025 09.00 – 10.00 Python for Big Data Analytics
- Introduction to Google Colab
- Introduction to Python
- Write/Read Files
- Pandas Basic
Slides
10.00 – 10.15 Coffee Break
10.15 – 12.15 Python for Big Data Analytics
- Data Cleaning & Transformation
- Exploratory Data Analysis (EDA)
Slides
12.15 – 13.30 Lunch Break
13.30 – 15.30 Python for Big Data Analytics
Data Visualization using Python
Slides
15.30 – 15.45 Coffee Break
15.45 – 16.45 Python for Big Data Analytics
Data Visualization using Python
Slides
Thursday, 18 September 2025 09.00 – 10.00 Python for Big Data Analytics
Data Visualization using Kepler.gl
Slides
10.00 – 10.15 Coffee Break
10.15 – 12.15 Python for Big Data Analytics
- Introduction to PySpark
- PySpark DataFrame Operations
Slides, Slides
12.15 – 13.30 Lunch Break
13.30 – 15.30 Python for Big Data Analytics
- Joining & aggregating using PySpark
- PySpark window function
- SQL using PySpark
Slides
15.30 – 15.45 Coffee Break
15.45 – 16.45 Python for Big Data Analytics
DBSCAN Clustering
Slides, Slides
Friday, 19 September 2025 09.00 – 10.00 Concept and Definition of Tourism
Domestic & Inbound
Slides
10.00 – 10.15 Coffee Break
10.15 – 12.15 Concept and Definition of Tourism
Domestic & Inbound
Slides
12.15 – 13.30 Lunch Break
13.30 – 15.30 Introduction to MPD and MPD Access Slides
15.30 – 15.45 Coffee Break
15.45 – 16.45 MPD for ICT Indicators Slides
Saturday, 20 September 2025 09.00 – 10.00 Quality Assurance Slides
10.00 – 10.15 Coffee Break
10.15 – 12.15 Quality Assurance Slides
12.15 – 13.30 Lunch Break
13.30 – 15.30 Usual Environment Slides
15.30 – 15.45 Coffee Break
15.45 – 16.45 Usual Environment Slides
Sunday, 21 September 2025 09.00 – 10.00 Stop-spot Slides
10.00 – 10.15 Coffee Break
10.15 – 12.15 Stop-spot Slides
12.15 – 13.30 Lunch Break
13.30 – 15.30 Domestic Tourism Slides
15.30 – 15.45 Coffee Break
15.45 – 16.45 Domestic Tourism Slides
Monday, 22 September 2025 08.00 – 12.00 Visit to Badung Smart City
12.00 – 13.30 Lunch Break
13.30 – 15.30 Inbound Tourism Slides
15.30 – 15.45 Coffee Break
15.45 – 16.45 Inbound Tourism Slides
Tuesday, 23 September 2025 09.00 – 10.00 Event Analysis
10.00 – 10.15 Coffee Break
10.15 – 12.15 Event Analysis Slides
12.15 – 13.30 Lunch Break
13.30 – 15.30 Survey Design and Weighting Slides
15.30 – 15.45 Coffee Break
15.45 – 16.45 Survey Design and Weighting Slides
Wednesday, 24 September 2025 09.00 – 10.00 Tourism Statistics Slides
10.00 – 10.15 Coffee Break
10.15 – 12.15 Tourism Statistics Slides
12.15 – 13.30 Lunch Break
13.30 – 15.30 Group Presentation
15.30 – 15.45 Coffee Break
15.45 – 16.45 Group Presentation
Thursday, 25 September 2025 09.00 – 16.00 Group Discussion
Friday, 26 September 2025 09.00 – 12.00 Group Discussion
12.00 – 14.00 Lunch Break
14.00 – 15.00 Big Data for Official Statistics
15.00 – 16.00 Closing
Saturday, 27 September 2025 08.00 – 17.00 Sociocultural Activity

Outputs and Outcomes

The short course generated several important outputs. Participants gained practical skills in SQL, Python, and PySpark for big data analytics, along with a solid understanding of MPD methodologies such as data preprocessing, trip identification, stop-spot analysis, and the production of tourism-related indicators. In addition, each country team developed feasibility assessments that mapped opportunities, identified challenges, and proposed possible cooperation mechanisms with mobile network operators in their respective contexts. These outputs provided concrete deliverables that participants could take back to their institutions for immediate use.

In terms of long-term outcomes, the program is expected to strengthen the readiness of partner countries to integrate MPD into their official statistical systems. It also laid the groundwork for future pilot projects in MPD-based tourism statistics while enhancing regional collaboration and building a stronger community of practice. Through this shared experience, participating countries are better positioned to modernize their statistical processes and contribute to the advancement of official statistics across Asia and the Pacific.

Acknowledgment

This short course was made possible through the generous support of Indonesia AID as the funding partner, the commitment of BPS–Statistics Indonesia, and the dedication of the Regional Hub Secretariat at Politeknik Statistika STIS.

Special thanks are extended to the trainers and facilitators from BPS and STIS, who not only delivered technical sessions but also mentored participants throughout the exercises. Most importantly, the organizers express gratitude to all participants for their active engagement, enthusiasm, and commitment over the two-week program.