Wiki source code of Friday 14 February 2025

Version 6.7 by dennis yoshikawa on 2025/02/14 01:20

1 = Weekly Report 14 February 2025 =
2
3
4 == (% id="cke_bm_29973S" style="display:none" %) (%%)Data Engineer ==
5
6 >What Have You Done in This Week?
7
8 (((
9 {{success}}
10 What Have You Done?
11 {{/success}}
12
13 * Airflow-airbyte maintenance
14
15 ~1. Create a task to delete the GCS folder for the dharmadexa table
16 2. Change the incremental method in Airbyte for tb
17
18 * Dkonsul
19
20 ~1. Deploy transformation code to production.
21 2. Final QC with partnership team. 
22 3. Migrate datasets: Herbawell Dashboard, BRI Medika DKONSUL
23
24 * D2D
25
26 ~1. Enhance the stickiness rate logic (mart)
27 2. Weekly regroup with Product
28
29 * MCN
30
31 ~1. Troubleshoot DAG and resource issues
32 2. Repoint all existing pipelines to the new database
33
34 * Screening
35
36 ~1. Regroup with the PM and DA to validate data and discuss the funnel data
37 2. Enhance the logic of the dharmadexa fact table
38 3. Prepare dharmadexa phase 3 data (funneling)
39 4. Test geo.py to get the city from longitude and latitude
40 5. Migrate the dataset cleaned_screening_merck.dim_screening_stunting to CH
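
The coordinate-to-city lookup mentioned in the Screening items could work along the following lines. This is only a minimal sketch and not the actual geo.py: the `CITIES` table, its approximate coordinates, the `max_km` cutoff, and both function names are assumptions made for illustration. It picks the nearest known city by great-circle (haversine) distance.

```python
import math

# Hypothetical reference table (city, latitude, longitude); approximate values
# used for illustration only, not taken from the project data.
CITIES = [
    ("Jakarta", -6.2088, 106.8456),
    ("Bandung", -6.9175, 107.6191),
    ("Surabaya", -7.2575, 112.7521),
]


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def nearest_city(lat, lon, cities=CITIES, max_km=50.0):
    """Return the closest city name, or None if no city lies within max_km."""
    name, dist = min(
        ((city, haversine_km(lat, lon, c_lat, c_lon))
         for city, c_lat, c_lon in cities),
        key=lambda pair: pair[1],
    )
    return name if dist <= max_km else None
```

A point-in-polygon test against administrative boundaries would be more accurate near city borders; the nearest-centre approach is shown only because it needs no external data.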
41
42 {{warning}}
43 What Issues You Have?
44 {{/warning}}
45
46 1. Open source data quality tools are hard to find; most of the available tools are paid.
47
48 {{info}}
49 What Next You Will Do? (Optional)
50 {{/info}}
51
52 1. Continue exploring open source data quality tools
53 1. Tune queries to reduce BigQuery cost
54
55 **What Support You Need? (Optional)**
56
57
58 ----
59
60
61 )))
62
63
64 == (% id="cke_bm_29973S" style="display:none" %) (%%)Data Analyst ==
65
66 >What Have You Done in This Week?
67
68 (((
69 {{success}}
70 What Have You Done?
71 {{/success}}
72
73 1. **Dharma Dexa Phase 2** – Completed the second phase of Dharma Dexa.
74 1. **Migration from Looker to Metabase** – Transitioning data visualization and analytics from Looker to Metabase.
75 1. **Redesigned DKonsul Data** – Improved the structure and organization of DKonsul data.
76 1. **Ad-hoc Requests** – Handled various on-demand data requests.
77
78 {{warning}}
79 What Issues You Have?
80 {{/warning}}
81
82 1. **Metabase Limitations** – Limited chart options and flexibility in customization, particularly a lack of aggregation functions.
83
84 {{info}}
85 What Next You Will Do? (Optional)
86 {{/info}}
87
88 1. **Continue Redesigning DKonsul Data**
89 1*. Refining the data funnel from consultation → prescription → transaction.
90 1. **Continue Migration to Metabase** – Ensuring a smooth transition from Looker to Metabase.
91 1. **Dharma Dexa Phase 3** – Proceeding with the next phase of the Dharma Dexa project.
92 1. **AppSheet MCN Visit Tracker Dashboard** – Developing and optimizing the dashboard.
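
The consultation → prescription → transaction funnel mentioned above can be summarised with step-by-step and overall conversion rates. The sketch below is illustrative only: the function name and the example counts are hypothetical and do not come from DKonsul data.

```python
def funnel_conversion(stage_counts):
    """Compute conversion rates for an ordered list of (stage, count) pairs.

    step_rate is the share of the previous stage that reached this stage
    (None for the first stage); overall_rate is the share of the first stage.
    """
    if not stage_counts:
        return []
    first_count = stage_counts[0][1]
    rows = []
    previous = None
    for stage, count in stage_counts:
        if previous is None:
            step_rate = None
        else:
            step_rate = count / previous if previous else 0.0
        overall_rate = count / first_count if first_count else 0.0
        rows.append({
            "stage": stage,
            "count": count,
            "step_rate": step_rate,
            "overall_rate": overall_rate,
        })
        previous = count
    return rows


# Made-up counts, for illustration only.
example = [("consultation", 200), ("prescription", 120), ("transaction", 60)]
for row in funnel_conversion(example):
    print(row)
```

Keeping both rates side by side makes it easier to see whether a drop happens at a single step or accumulates across the funnel.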
93
94 **What Support You Need? (Optional)**
95
96 1. **Data Validation** – Ensuring data accuracy and consistency.
97 1. **Dharma Dexa Screening Enhancements**
98 1*. Assigning a **new screening ID** for each event, especially if different questions and inputs are involved.
99 1*. Adding **location input (province, city)** for better analysis.
100 )))
101
102
103 ----
104
105 == Data Analyst & AI ==
106
107 >What Have You Done in This Week?
108
109 (((
110 {{success}}
111 What Have You Done?
112 {{/success}}
113
114 **AUTOMARK**
115
116 1. Deployed, evaluated, and documented the **Screening Dharma Dexa API** for access by the Tech team. The API is deployed at [[https:~~/~~/datalake.ptgue.com/v1/users_dd>>https://datalake.ptgue.com/v1/users_dd]] and the **documentation has already been sent to the Tech team**. **The API can be queried with a prompt** to retrieve the users relevant to that prompt. As of 13 Feb 2024, **the RAG accuracy is 87.50%**.
117 1. Deployed, evaluated, and documented the **master user GUE Ecosystem** API for access by the Tech team. The API is deployed at [[https:~~/~~/datalake.ptgue.com/v1/users>>https://datalake.ptgue.com/v1/users]] and the **documentation has already been sent to the Tech team**. **The API has already been tested by the Tech team with no issues found.**
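
An accuracy figure like the one reported above can be derived from a labelled evaluation set. The sketch below assumes one possible definition: a query counts as correct when every expected user ID appears in the retrieved set. The report does not state how its figure was measured, so the definition, the function name, and the sample data are all hypothetical.

```python
def rag_accuracy(eval_results):
    """Percentage of queries whose expected IDs were all retrieved.

    eval_results: list of (expected_ids, retrieved_ids) pairs.
    Assumed correctness rule: expected_ids is a subset of retrieved_ids.
    """
    if not eval_results:
        raise ValueError("eval_results must not be empty")
    correct = sum(
        1 for expected, retrieved in eval_results
        if set(expected) <= set(retrieved)
    )
    return 100.0 * correct / len(eval_results)


# Made-up evaluation set: 7 of 8 queries retrieve everything expected.
sample = [({"u1"}, {"u1", "u2"})] * 7 + [({"u3"}, {"u4"})]
print(f"{rag_accuracy(sample):.2f}%")
```

Under this rule, each added evaluation query changes the denominator, so accuracy figures are only comparable when computed on the same evaluation set.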
118
119 **MCN**
120
121 1. Fixed the resource spike caused by the scraping schedule. The fix has passed testing for 3 days (Wednesday, Thursday, and Friday). **Scraping time decreased from 1 minute per user to a maximum of about 20 seconds per user.** **(Pairing with Syifa-DE)**
122 1. **Repointed, realigned, and redesigned the pipeline and database** consumed for scraping. **(Pairing with Syifa-DE)**
123
124 **Automation**
125
126 1. **Implemented compliance mapping** with the newest data from the compliance team
127 1. Validated the data, flow, and script for the doctor user performance report. **The sample report has already been validated by dr. Astrid.**
128
129 **Others**
130
131 1. Request + enhance Merck dashboard
132 1. Request DKonsul data for online doctors, transactions, prescriptions, Dexa prescriptions, and a comparison of new users in January
133 1. Request ICD-10 data
134
135 {{warning}}
136 What Issues You Have?
137 {{/warning}}
138
139 1. The Tech team provided no information about the MCN database repointing, so the data was not updated. This was solved by coordinating with the Product team.
140 1. RAG accuracy needs to improve to around 90%.
141
142 {{info}}
143 What Next You Will Do? (Optional)
144 {{/info}}
145
146 1. Increase RAG accuracy by adding more training queries for the LLM
147 1. Start daily recurring scraping for the TikTok dashboard
148 1. Start DKonsul insights next week
149
150 **What Support You Need? (Optional)**
151
152
153 === Summary ===
154
155 This is a summary of the weekly report of 14 February 2025, written by Haekal Yusril Faizin. The report summarizes the activities and issues of the Data Engineer, Data Analyst, and Data Analyst & AI teams.
156
157 **Data Analyst**
158
159 * Completed the second phase of Dharma Dexa.
160 * Moved data visualization and analytics from Looker to Metabase.
161 * Redesigned DKonsul data for better structure and organization.
162 * Handled various on-demand data requests.
163
164 **Data Analyst & AI**
165
166 * Deployed, evaluated, and documented the Screening Dharma Dexa API for access by the Tech team.
167 * Deployed, evaluated, and documented the master user GUE Ecosystem for access by the Tech team.
168 * Fixed the resource spike caused by the scraping schedule.
169 * Repointed, realigned, and redesigned the pipeline and database used for scraping.
170 * Implemented compliance mapping with the newest data from the compliance team.
171 * Validated the data, flow, and script for the doctor user performance report.
172
173 **Issues**
174
175 * Limited chart options and customization flexibility in Metabase.
176 * Lack of information from the Tech team about the MCN database repointing.
177 * RAG accuracy needs to improve to around 90%.
178
179 **Next Steps**
180
181 * Continue redesigning DKonsul data and migrating to Metabase.
182 * Proceed with Dharma Dexa Phase 3 and develop the AppSheet MCN Visit Tracker Dashboard.
183 * Increase RAG accuracy, start daily recurring scraping for the TikTok dashboard, and start DKonsul insights next week.
184
185 **Support Needed**
186
187 * Data validation and Dharma Dexa screening enhancements.
188 )))