May 07, 2026
Daniel Johnson
Anatomy of skills.md (Community Standard)
🔍 1. Anatomy of skills.md (Community Standard)
This file typically has the following structure:
---
name: "Data Analyst"
version: "1.2"
role: "Data analysis specialist focused on Python, pandas, and visualization"
priority: 1
---
## 🧠 Core Skills
- Tabular data cleaning & transformation
- Descriptive & inferential statistical analysis
- Visualization with Plotly/Matplotlib
- pandas query optimization for datasets >100MB
## 🔄 Workflow
1. Validate the data schema & the kinds of missing values
2. Apply imputation/transformation appropriate to the business context
3. Produce code plus a brief explanation per step
4. Include a warning whenever assumptions are not met
## 🚫 Constraints
- Always use type hints & docstrings
- Never assume columns exist without confirmation
- Output Python in code blocks tagged `python`
- Avoid deprecated APIs (e.g. `df.append`)
## 📎 Examples
<example>
User: "Clean the 'age' column, which contains NaN values & outliers"
Assistant:
```python
import pandas as pd
import numpy as np
def clean_age(df: pd.DataFrame) -> pd.DataFrame:
    """Replace infinities, impute NaN with the median, and cap IQR outliers."""
    df['age'] = df['age'].replace([np.inf, -np.inf], np.nan)
    # Assign back instead of inplace fillna on a column (deprecated pattern)
    df['age'] = df['age'].fillna(df['age'].median())
    # IQR capping
    Q1, Q3 = df['age'].quantile([0.25, 0.75])
    IQR = Q3 - Q1
    df.loc[df['age'] < Q1 - 1.5*IQR, 'age'] = Q1 - 1.5*IQR
    df.loc[df['age'] > Q3 + 1.5*IQR, 'age'] = Q3 + 1.5*IQR
    return df
```
</example>
🏗️ 2. Implementation Architecture in Qwen Studio
[Frontend UI] → Upload/Edit `skills.md` → [Backend Parser] → Schema Validation
↓
[Prompt Assembler] → Combine: System Base + Skills + Context → [Qwen API Client]
↓
[Storage Layer] → Store per Project/User → Versioning & On/Off Toggle
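The versioning in the storage layer can be sketched as a content hash per saved revision, matching the MD5-based scheme mentioned in section 4. This is a minimal illustration; the `SkillVersion` record and `save_version` helper are hypothetical names, and a real deployment would persist them in PostgreSQL/SQLite rather than an in-memory list:

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass
class SkillVersion:
    # Hypothetical version record; a real schema would live in the DB layer.
    content: str
    content_hash: str
    saved_at: str

def save_version(history: list, md_text: str) -> SkillVersion:
    """Append a new version only if the content actually changed."""
    digest = hashlib.md5(md_text.encode("utf-8")).hexdigest()
    if history and history[-1].content_hash == digest:
        return history[-1]  # unchanged content: reuse the latest version
    version = SkillVersion(
        content=md_text,
        content_hash=digest,
        saved_at=datetime.now(timezone.utc).isoformat(),
    )
    history.append(version)
    return version
```

Because the hash is computed from the raw `skills.md` text, re-saving an unmodified file creates no new revision, which keeps the rollback history compact.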
Key Components:
| Layer | Technology/Approach |
|---|---|
| Parser | python-frontmatter + pydantic + markdown-it-py |
| Validation | JSON Schema / Pydantic model |
| Prompt Assembly | Template engine + token budgeting |
| Storage | PostgreSQL/SQLite (JSONB for skills) + Redis (active-skills cache) |
| Qwen Integration | dashscope SDK or REST API (role: "system") |
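The token budgeting mentioned in the Prompt Assembly row can be sketched as a greedy trim: admit prompt parts in priority order until the budget is spent, then emit the survivors in their original order. The `fit_to_budget` helper and the ~4 characters/token estimate are illustrative assumptions; production code should use the model's real tokenizer:

```python
from typing import List, Tuple

def fit_to_budget(parts: List[Tuple[int, str]], max_tokens: int) -> str:
    """Keep the highest-priority prompt parts within a token budget.

    parts: (priority, text) pairs; a lower number means higher priority,
    mirroring the `priority` field in the skills.md frontmatter.
    Token cost is estimated at roughly 4 characters per token.
    """
    est = lambda text: len(text) // 4 + 1  # crude estimate, not a tokenizer
    survivors, used = set(), 0
    # Visit parts from highest to lowest priority.
    for idx, (priority, text) in sorted(enumerate(parts), key=lambda e: e[1][0]):
        cost = est(text)
        if used + cost <= max_tokens:
            survivors.add(idx)
            used += cost
    # Emit survivors in their original document order.
    return "\n\n".join(text for i, (_, text) in enumerate(parts) if i in survivors)
```

With a tight budget, bulky low-priority parts (typically the examples section) are dropped first while the role and constraints survive.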
💻 3. Minimal Implementation (Python)
```python
# skills_parser.py
import re
from typing import List

import yaml
from pydantic import BaseModel, Field, validator


class SkillsSchema(BaseModel):
    name: str
    version: str
    role: str
    priority: int = Field(ge=0, le=10, default=5)
    core_skills: List[str]
    workflow: List[str]
    constraints: List[str]
    examples: str = ""

    # pydantic v1-style validator (use field_validator in pydantic v2)
    @validator('core_skills', 'workflow', 'constraints', pre=True)
    def parse_list(cls, v):
        # Accept either a raw string or an already-split list of lines;
        # strip bullet ("- ") and numbering ("1. ") prefixes.
        lines = v.split('\n') if isinstance(v, str) else v
        return [re.sub(r'^(-\s+|\d+\.\s+)', '', line.strip())
                for line in lines if line.strip()]


def parse_skills_md(md_text: str) -> SkillsSchema:
    # Extract the YAML frontmatter
    match = re.match(r'^---\n(.*?)\n---\s*(.*)', md_text, re.DOTALL)
    if not match:
        raise ValueError("Invalid skills.md format: missing YAML frontmatter")
    frontmatter_yaml, body = match.groups()
    data = yaml.safe_load(frontmatter_yaml)

    # Extract "## ..." sections; emoji in headings are stripped so that
    # "## 🧠 Core Skills" maps to the key "core_skills"
    sections = {}
    current_section = None
    for line in body.split('\n'):
        if line.startswith('## '):
            title = re.sub(r'[^A-Za-z ]', '', line[3:]).strip()
            current_section = title.lower().replace(' ', '_')
            sections[current_section] = []
        elif current_section is not None:
            sections[current_section].append(line.strip())

    # Map to the schema
    parsed = {
        "name": data.get("name", "Default"),
        "version": data.get("version", "1.0"),
        "role": data.get("role", ""),
        "priority": data.get("priority", 5),
        "core_skills": sections.get("core_skills", []),
        "workflow": sections.get("workflow", []),
        "constraints": sections.get("constraints", []),
        "examples": body  # raw body kept for reference
    }
    return SkillsSchema(**parsed)


def assemble_system_prompt(skills: SkillsSchema, base_prompt: str = "") -> str:
    prompt_parts = [base_prompt.strip()] if base_prompt else []
    prompt_parts.append(f"### 🎭 ROLE\n{skills.role}")
    prompt_parts.append("### 🧩 CORE CAPABILITIES\n" + "\n".join(f"- {s}" for s in skills.core_skills))
    prompt_parts.append("### 🔄 WORKFLOW\n" + "\n".join(f"{i+1}. {s}" for i, s in enumerate(skills.workflow)))
    prompt_parts.append("### 🚫 CONSTRAINTS\n" + "\n".join(f"- {c}" for c in skills.constraints))
    if skills.examples:
        prompt_parts.append(f"### 📎 GUIDING EXAMPLES\n{skills.examples}")
    return "\n\n".join(p.strip() for p in prompt_parts if p.strip())
```
Integration with the Qwen API:
```python
from dashscope import Generation

def call_qwen_with_skills(user_input: str, skills_md: str):
    skills = parse_skills_md(skills_md)
    system_prompt = assemble_system_prompt(skills)
    response = Generation.call(
        model="qwen-max",
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_input}
        ],
        temperature=0.3,
        max_tokens=2048
    )
    return response.output.text
```
🛡️ 4. Production Considerations (Critical)
| Aspect | Solution |
|---|---|
| Token Limit | Use a sliding window, prioritize constraints & workflow, compress examples if >2k tokens |
| Prompt Injection | Sanitize the Markdown: strip HTML tags, cap the length, block `{{ }}`, strict YAML validation |
| Multi-Skills | Order by priority, join with a `===` separator, deduplicate conflicting rules |
| Versioning | Store an MD5 hash per version, rollback UI, audit trail |
| Qwen Studio Integration | Use the Custom System Prompt or Project Config feature if available; otherwise build a middleware API wrapper |
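The prompt-injection row above can be turned into a small sanitizer following exactly those rules: strip HTML tags, block `{{ }}` template markers, and enforce a length cap. The regex and the 20,000-character limit are illustrative choices, not Qwen Studio defaults, and a real pipeline would whitelist known tags such as `<example>` rather than stripping everything:

```python
import re

MAX_SKILLS_CHARS = 20_000  # illustrative cap, not an official limit

def sanitize_skills_md(md_text: str) -> str:
    """Apply the sanitization rules from the table above."""
    if len(md_text) > MAX_SKILLS_CHARS:
        raise ValueError(f"skills.md exceeds {MAX_SKILLS_CHARS} characters")
    # Crude tag stripper: removes anything shaped like an HTML/XML tag.
    # Note: this also removes structural tags like <example>; a production
    # sanitizer should whitelist those instead.
    cleaned = re.sub(r"<[^>]+>", "", md_text)
    # Block template markers that could smuggle instructions into a
    # downstream template engine.
    if "{{" in cleaned or "}}" in cleaned:
        raise ValueError("template markers {{ }} are not allowed")
    return cleaned
```

Running the sanitizer before `parse_skills_md` keeps untrusted uploads from reaching the prompt assembler unchecked.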
📅 5. Implementation Roadmap in Qwen Studio
- Prototype (1-2 weeks): Parser + Validator + CLI test with dashscope
- UI Integration (2 weeks): Monaco Markdown editor, prompt preview, per-project on/off toggle
- Storage & Versioning (1 week): DB schema, Git-like version diff, restore points
- Optimization (1 week): Token budgeting, async validation, caching of active skills
- Security & Audit (1 week): Sanitizer pipeline, prompt-injection test suite, rate limiting
- Deployment & Docs: Internal wiki, user guide, example templates per domain (coding, research, design, etc.)
🔗 Technical References
- Qwen API Docs: https://help.aliyun.com/zh/model-studio/developer-reference/api-reference
- Dashscope Python SDK: `pip install dashscope`
- Prompt Engineering Best Practices: https://platform.openai.com/docs/guides/prompt-engineering (the general patterns also apply to Qwen)
- Pydantic + YAML parsing: https://docs.pydantic.dev/latest/