performing-api-inventory-and-discovery

Performing API Inventory and Discovery

When to Use

  • Mapping the complete API attack surface of an organization before a security assessment
  • Identifying shadow APIs deployed by development teams without security review
  • Discovering deprecated or zombie API versions that remain accessible but unmaintained
  • Finding undocumented API endpoints exposed through mobile applications, SPAs, or microservices
  • Building an API inventory for compliance requirements (PCI-DSS, SOC2, GDPR)
Do not use without written authorization. API discovery involves scanning network infrastructure and analyzing traffic.

Prerequisites

  • Written authorization specifying the target domains and network ranges
  • Passive traffic capture capability (network tap, proxy, or cloud traffic mirroring)
  • Active scanning tools: Amass, subfinder, httpx, and nuclei
  • JavaScript analysis tools: LinkFinder, JS-Miner, or custom parsers
  • Access to cloud console (AWS, Azure, GCP) for API gateway inventory
  • Burp Suite Professional for passive API endpoint discovery
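Before starting, it can help to confirm that the active scanning tools are actually installed. The following is a minimal sketch, assuming the binaries are on `PATH` under the names listed above; `missing_tools` is a hypothetical helper, not part of any of these tools.

```python
import shutil

# Binary names as listed in the prerequisites; adjust if your installation differs.
REQUIRED_TOOLS = ["amass", "subfinder", "httpx", "nuclei"]


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the subset of tools that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing tools:", ", ".join(missing))
    else:
        print("All scanning tools found on PATH")
```

Note that a binary named `httpx` on `PATH` may belong to the Python HTTP client of the same name rather than the ProjectDiscovery probe, so a successful lookup is only a first check.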

Workflow

Step 1: Passive API Discovery from Traffic Analysis

```python
import re
import json
from collections import defaultdict


# Parse HAR file from browser developer tools or proxy
def analyze_har_for_apis(har_file_path):
    """Extract API endpoints from HTTP Archive (HAR) file."""
    with open(har_file_path) as f:
        har = json.load(f)

    api_endpoints = defaultdict(lambda: {
        "methods": set(), "content_types": set(),
        "auth_types": set(), "count": 0
    })

    for entry in har["log"]["entries"]:
        url = entry["request"]["url"]
        method = entry["request"]["method"]

        # Identify API patterns
        api_patterns = [
            r'/api/', r'/v\d+/', r'/graphql', r'/rest/',
            r'/ws/', r'/rpc/', r'/grpc', r'/json',
        ]

        if any(re.search(p, url) for p in api_patterns):
            # Normalize the URL (remove query params and IDs)
            normalized = re.sub(r'\?.*$', '', url)
            normalized = re.sub(r'/\d+(/|$)', '/{id}\\1', normalized)
            normalized = re.sub(
                r'/[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}',
                '/{uuid}', normalized)

            ep = api_endpoints[normalized]
            ep["methods"].add(method)
            ep["count"] += 1

            # Detect authentication type
            for header in entry["request"]["headers"]:
                name = header["name"].lower()
                if name == "authorization":
                    if "bearer" in header["value"].lower():
                        ep["auth_types"].add("Bearer/JWT")
                    elif "basic" in header["value"].lower():
                        ep["auth_types"].add("Basic")
                elif name == "x-api-key":
                    ep["auth_types"].add("API Key")

            # Detect content type
            content_type = next(
                (h["value"] for h in entry["request"]["headers"]
                 if h["name"].lower() == "content-type"), None)
            if content_type:
                ep["content_types"].add(content_type.split(";")[0])

    print(f"Discovered {len(api_endpoints)} unique API endpoints:\n")
    for url, info in sorted(api_endpoints.items()):
        methods = ", ".join(sorted(info["methods"]))
        auth = ", ".join(info["auth_types"]) or "None"
        print(f"  [{methods}] {url}")
        print(f"    Auth: {auth} | Requests: {info['count']}")

    return api_endpoints
```
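The URL normalization inside `analyze_har_for_apis` can be pulled out and checked on its own. The sketch below is one possible variant, not the original logic: `normalize_api_url` is a hypothetical helper, and it uses a lookahead instead of the capturing group so that two adjacent numeric segments (for example `/12/34`) are both replaced. With the original `/\d+(/|$)` pattern the first match consumes the separating slash, so the second segment is left unchanged.

```python
import re

UUID_RE = re.compile(
    r'/[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}',
    re.IGNORECASE,
)
# Lookahead keeps the trailing slash available for the next match.
NUMERIC_ID_RE = re.compile(r'/\d+(?=/|$)')


def normalize_api_url(url):
    """Strip the query string and replace UUID and numeric path segments."""
    path = url.split("?", 1)[0]
    path = UUID_RE.sub("/{uuid}", path)
    path = NUMERIC_ID_RE.sub("/{id}", path)
    return path


print(normalize_api_url("https://api.example.com/api/v1/users/42?expand=true"))
print(normalize_api_url("https://api.example.com/orders/12/items/34"))
```

Version segments such as `/v1` are untouched because the digits do not directly follow the slash.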

Step 2: Active API Endpoint Discovery

```bash
# DNS enumeration for API subdomains
amass enum -d example.com -o amass_results.txt
subfinder -d example.com -o subfinder_results.txt

# Filter for API-related subdomains
# (-h suppresses the "filename:" prefix grep adds when given several files)
grep -hiE '(api|rest|graphql|ws|gateway|backend|internal|staging|dev|v1|v2)' \
    amass_results.txt subfinder_results.txt | sort -u > api_subdomains.txt

# Check which subdomains are alive
cat api_subdomains.txt | httpx -status-code -content-length -title \
    -tech-detect -o live_apis.txt

# Probe common API paths on each live subdomain
cat api_subdomains.txt | while read domain; do
    for path in /api /api/v1 /api/v2 /graphql /swagger.json /openapi.json \
        /api-docs /docs /health /status /metrics /actuator; do
        curl -s -o /dev/null -w "%{http_code} %{url_effective}\n" \
            "https://${domain}${path}" 2>/dev/null | grep -v "^404"
    done
done
```

```python
import requests
import concurrent.futures

def discover_api_endpoints(base_domains):
    """Actively probe for API endpoints across discovered domains."""

    # Common API paths to test
    API_PATHS = [
        "/api", "/api/v1", "/api/v2", "/api/v3",
        "/graphql", "/gql", "/query",
        "/rest", "/json", "/rpc",
        "/swagger.json", "/swagger/v1/swagger.json",
        "/openapi.json", "/openapi.yaml", "/api-docs",
        "/docs", "/redoc", "/explorer",
        "/.well-known/openid-configuration",
        "/health", "/healthz", "/ready",
        "/status", "/info", "/version",
        "/metrics", "/prometheus",
        "/actuator", "/actuator/health", "/actuator/info",
        "/admin", "/admin/api", "/internal",
        "/debug", "/debug/vars", "/debug/pprof",
        "/ws", "/websocket", "/socket.io",
        "/grpc", "/twirp",
    ]

    discovered = []

    def check_endpoint(domain, path):
        # Try HTTPS first and fall back to plain HTTP
        for scheme in ["https", "http"]:
            url = f"{scheme}://{domain}{path}"
            try:
                resp = requests.get(url, timeout=5, allow_redirects=False,
                                  verify=False)  # TLS verification disabled for discovery; enable in production
                if resp.status_code not in (404, 502, 503):
                    return {
                        "url": url,
                        "status": resp.status_code,
                        "content_type": resp.headers.get("Content-Type", ""),
                        "server": resp.headers.get("Server", ""),
                        "size": len(resp.content),
                    }
            except requests.exceptions.RequestException:
                pass
        return None

    with concurrent.futures.ThreadPoolExecutor(max_workers=20) as executor:
        futures = {}
        for domain in base_domains:
            for path in API_PATHS:
                future = executor.submit(check_endpoint, domain, path)
                futures[future] = (domain, path)

        for future in concurrent.futures.as_completed(futures):
            result = future.result()
            if result:
                discovered.append(result)
                print(f"  [FOUND] {result['url']} -> {result['status']} ({result['content_type']})")

    return discovered
```
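When probing turns up an exposed `/swagger.json` or `/openapi.json`, the document itself lists the endpoints it describes. The following is a minimal sketch of reading those out, assuming the response body has already been parsed into a dictionary (for example with `resp.json()`); `endpoints_from_openapi` is a hypothetical helper name.

```python
# Operation keys allowed under a path item in OpenAPI 3.x.
HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}


def endpoints_from_openapi(spec):
    """Return sorted (METHOD, path) pairs declared in a parsed OpenAPI/Swagger document."""
    endpoints = []
    for path, item in spec.get("paths", {}).items():
        if not isinstance(item, dict):
            continue
        for key in item:
            # Path items also hold non-operation keys such as "parameters".
            if key.lower() in HTTP_METHODS:
                endpoints.append((key.upper(), path))
    return sorted(endpoints)


sample_spec = {
    "openapi": "3.0.0",
    "paths": {
        "/users": {"get": {}, "post": {}, "parameters": []},
        "/users/{id}": {"delete": {}},
    },
}
for method, path in endpoints_from_openapi(sample_spec):
    print(method, path)
```

The resulting pairs can be fed into the same inventory as the probed URLs, which makes exposed documentation a cheap source of endpoints that path guessing would miss.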

Step 3: JavaScript Source Analysis for API Endpoints

```python
import re
import requests

def extract_apis_from_javascript(js_urls):
    """Extract API endpoints from JavaScript source files."""
    api_pattern = re.compile(
        r'''(?:['"`])((?:/api/|/v[0-9]+/|/graphql|/rest/)[^'"`\s<>{}]+)(?:['"`])''',
        re.IGNORECASE
    )
    url_pattern = re.compile(
        r'''(?:['"`])(https?://[a-zA-Z0-9._-]+(?:\.[a-zA-Z]{2,})+(?:/[^'"`\s<>{}]*)?)(?:['"`])'''
    )
    fetch_pattern = re.compile(
        r'''(?:fetch|axios|ajax|XMLHttpRequest|\.get|\.post|\.put|\.delete|\.patch)\s*\(\s*(?:['"`])([^'"`]+)'''
    )

    all_endpoints = set()

    for js_url in js_urls:
        try:
            resp = requests.get(js_url, timeout=10)
            content = resp.text

            # Extract relative API paths
            for match in api_pattern.findall(content):
                all_endpoints.add(("relative", match))

            # Extract absolute URLs
            for match in url_pattern.findall(content):
                if any(kw in match.lower() for kw in ["/api", "/v1", "/v2", "graphql"]):
                    all_endpoints.add(("absolute", match))

            # Extract from fetch/axios calls
            for match in fetch_pattern.findall(content):
                all_endpoints.add(("fetch", match))

        except requests.exceptions.RequestException:
            pass

    print(f"\nAPI endpoints discovered from JavaScript ({len(all_endpoints)}):")
    for source, endpoint in sorted(all_endpoints):
        print(f"  [{source}] {endpoint}")

    return all_endpoints
```

```python
import re
import requests

# Find JavaScript files from the target domain
def find_js_files(domain):
    """Discover JavaScript files from a web application."""
    resp = requests.get(f"https://{domain}", timeout=10)
    # Assumed pattern: quoted src attributes that reference .js files
    js_files = re.findall(r'src=["\']([^"\']+\.js[^"\']*)["\']', resp.text)
    full_urls = []
    for js in js_files:
        if js.startswith("http"):
            full_urls.append(js)
        elif js.startswith("//"):
            # Protocol-relative URL
            full_urls.append(f"https:{js}")
        elif js.startswith("/"):
            # Root-relative path on the same host
            full_urls.append(f"https://{domain}{js}")
    return full_urls
```
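The `(source, endpoint)` tuples returned by `extract_apis_from_javascript` mix relative paths and absolute URLs, so they usually need resolving before they can be probed or compared with other findings. A minimal sketch using the standard library, assuming the page the scripts were loaded from is known; `resolve_endpoints` is a hypothetical helper.

```python
from urllib.parse import urljoin, urlparse


def resolve_endpoints(page_url, endpoints):
    """Resolve (source, value) tuples against page_url and deduplicate the result."""
    resolved = set()
    for _source, value in endpoints:
        absolute = urljoin(page_url, value)  # absolute values pass through unchanged
        if urlparse(absolute).scheme in ("http", "https"):
            resolved.add(absolute)
    return sorted(resolved)


found = {
    ("relative", "/api/v1/users"),
    ("fetch", "/api/v1/users"),
    ("absolute", "https://api.example.com/v2/orders"),
}
for url in resolve_endpoints("https://app.example.com/", found):
    print(url)
```

The same path reported by two patterns collapses into one URL here, which keeps later counts honest.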

Step 4: Cloud API Gateway Inventory

```python
import boto3

def inventory_aws_apis():
    """Inventory all APIs in AWS API Gateway."""
    apigw = boto3.client('apigateway')
    apigwv2 = boto3.client('apigatewayv2')
    region = apigw.meta.region_name  # region the client is configured for

    apis = []

    # Note: the list calls below are paginated; only the first page is read here.

    # REST APIs (API Gateway v1)
    rest_apis = apigw.get_rest_apis()
    for api in rest_apis['items']:
        resources = apigw.get_resources(restApiId=api['id'])
        stages = apigw.get_stages(restApiId=api['id'])

        for stage in stages['item']:
            for resource in resources['items']:
                for method in resource.get('resourceMethods', {}).keys():
                    apis.append({
                        "type": "REST",
                        "name": api['name'],
                        "stage": stage['stageName'],
                        "path": resource['path'],
                        "method": method,
                        "url": f"https://{api['id']}.execute-api.{region}.amazonaws.com/{stage['stageName']}{resource['path']}",
                        "created": str(api.get('createdDate', '')),
                    })

    # HTTP APIs (API Gateway v2)
    http_apis = apigwv2.get_apis()
    for api in http_apis['Items']:
        routes = apigwv2.get_routes(ApiId=api['ApiId'])

        for route in routes['Items']:
            apis.append({
                "type": "HTTP",
                "name": api['Name'],
                "route": route['RouteKey'],
                "api_id": api['ApiId'],
                "protocol": api['ProtocolType'],
            })

    print(f"\nAWS API Inventory ({len(apis)} endpoints):")
    for api in apis:
        print(f"  [{api['type']}] {api.get('name')} - {api.get('method', '')} {api.get('path', api.get('route', ''))}")

    return apis
```
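Calls such as `get_rest_apis` and `get_apis` return results one page at a time, so an account with many APIs needs the continuation token followed to get a complete inventory. The sketch below shows the general loop with a hypothetical `iter_pages` helper. It assumes, per the boto3 documentation as I understand it, that the v1 client passes the token as `position` and the v2 client as `NextToken`; verify both against your boto3 version, or use the built-in paginators where available.

```python
def iter_pages(call, items_key, token_name, **kwargs):
    """Yield every item from a paginated list call by following its token.

    call       -- bound client method, e.g. apigw.get_rest_apis
    items_key  -- response key holding the items ("items" or "Items")
    token_name -- name used for the token in both request and response
    """
    token = None
    while True:
        params = dict(kwargs)
        if token:
            params[token_name] = token
        page = call(**params)
        yield from page.get(items_key, [])
        token = page.get(token_name)
        if not token:
            break


# Assumed usage with real clients (not executed here):
#   for api in iter_pages(apigw.get_rest_apis, "items", "position"): ...
#   for api in iter_pages(apigwv2.get_apis, "Items", "NextToken"): ...
```

Keeping the loop separate from the inventory logic makes it testable with a stub in place of the AWS client.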

Step 5: API Version and Shadow API Detection

```python
import re
import requests


def detect_shadow_and_zombie_apis(discovered_endpoints, documented_endpoints):
    """Compare discovered APIs against documented inventory."""

    # Normalize endpoints for comparison (version numbers and numeric IDs are collapsed)
    def normalize(ep):
        ep = re.sub(r'/v\d+/', '/vX/', ep)
        ep = re.sub(r'/\d+', '/{id}', ep)
        return ep.lower().rstrip('/')

    documented_normalized = {normalize(ep) for ep in documented_endpoints}

    shadow_apis = []  # Discovered but not documented
    zombie_apis = []  # Old versions still accessible

    for ep in discovered_endpoints:
        normalized = normalize(ep["url"])

        if normalized not in documented_normalized:
            # Heuristic: an undocumented URL with a version segment is treated
            # as an old version (zombie); anything else is treated as shadow
            if re.search(r'/v[0-9]+/', ep["url"]):
                zombie_apis.append(ep)
            else:
                shadow_apis.append(ep)

    print(f"\nShadow APIs (undocumented): {len(shadow_apis)}")
    for api in shadow_apis:
        print(f"  [SHADOW] {api['url']} -> {api['status']}")

    print(f"\nZombie APIs (deprecated versions): {len(zombie_apis)}")
    for api in zombie_apis:
        print(f"  [ZOMBIE] {api['url']} -> {api['status']}")

    # Check if zombie APIs lack security controls
    for api in zombie_apis:
        try:
            resp = requests.get(api["url"], timeout=5)
        except requests.exceptions.RequestException:
            continue  # Skip endpoints that are no longer reachable
        if resp.status_code not in (401, 403):
            print(f"  [CRITICAL] Zombie API accessible without auth: {api['url']}")

    return shadow_apis, zombie_apis
```
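Because `normalize` collapses every version segment to `/vX/`, an old version of a documented endpoint compares equal to the documented one and is never flagged. One way to separate the two cases is to compare version numbers explicitly. The sketch below assumes versions appear as `/vN` path segments and that the documented inventory is a list of paths or URLs; `split_version` and `classify_endpoints` are hypothetical names.

```python
import re
from urllib.parse import urlparse

VERSION_RE = re.compile(r'/v(\d+)(?=/|$)', re.IGNORECASE)


def split_version(url):
    """Return (version-agnostic path key, version number or None)."""
    path = urlparse(url).path if "://" in url else url
    match = VERSION_RE.search(path)
    version = int(match.group(1)) if match else None
    key = VERSION_RE.sub('/v{n}', path, count=1)
    key = re.sub(r'/\d+(?=/|$)', '/{id}', key)
    return key.lower().rstrip('/'), version


def classify_endpoints(discovered, documented):
    """Sort discovered URLs into documented, zombie, and shadow lists."""
    documented_versions = {}
    for url in documented:
        key, version = split_version(url)
        documented_versions.setdefault(key, set()).add(version)

    result = {"documented": [], "zombie": [], "shadow": []}
    for url in discovered:
        key, version = split_version(url)
        known = documented_versions.get(key)
        if known is None:
            result["shadow"].append(url)        # path not documented at all
        elif version in known:
            result["documented"].append(url)    # exact version is documented
        elif version is not None and any(v is not None and v > version for v in known):
            result["zombie"].append(url)        # older than a documented version
        else:
            result["shadow"].append(url)        # unknown or newer version
    return result


report = classify_endpoints(
    discovered=[
        "https://api.example.com/api/v1/users",
        "https://api.example.com/api/v2/users",
        "https://api.example.com/internal/debug",
    ],
    documented=["/api/v2/users"],
)
print(report)
```

Here `/api/v1/users` is reported as a zombie only because a newer version of the same path is documented, which matches the definition of a deprecated version more closely than flagging every undocumented versioned path.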

Key Concepts

| Term | Definition |
|------|------------|
| Shadow API | An API deployed by a development team without going through the official API management or security review process |
| Zombie API | A deprecated or old API version that remains accessible and running but is no longer maintained or monitored |
| API Inventory | A comprehensive catalog of all APIs in an organization including endpoint URLs, owners, versions, authentication methods, and data classifications |
| Improper Inventory Management | OWASP API9:2023 - failure to maintain an accurate API inventory, leading to unmonitored and unprotected API endpoints |
| Attack Surface | The total set of API endpoints, methods, and parameters that an attacker can potentially interact with |
| API Sprawl | The uncontrolled proliferation of APIs in an organization, often resulting from microservice adoption without centralized governance |
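The fields named in the API Inventory definition map naturally onto a small record type. The following is an illustrative sketch only: the class name, field names, defaults, and status values are assumptions, not a schema defined by this document or by OWASP.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class ApiInventoryRecord:
    """One entry in the API inventory (hypothetical schema)."""
    url: str
    owner: str = "unknown"
    version: str = ""
    auth_methods: list = field(default_factory=list)
    data_classification: str = "unclassified"
    status: str = "documented"  # assumed values: documented, shadow, zombie


record = ApiInventoryRecord(
    url="https://api.example.com/api/v1/users",
    version="v1",
    auth_methods=["Bearer/JWT"],
    status="zombie",
)
print(json.dumps(asdict(record), indent=2))
```

Storing `owner` with an explicit `unknown` default makes unowned endpoints easy to query, which is often the first question an audit asks.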

Tools & Systems

  • Amass: OWASP tool for attack surface mapping through DNS enumeration, web scraping, and API discovery
  • httpx: Fast HTTP probing tool for validating discovered domains and identifying live API endpoints
  • nuclei: Template-based scanner for detecting exposed API documentation, debug endpoints, and misconfigured services
  • Swagger UI Detector: Tool for finding exposed Swagger/OpenAPI documentation endpoints across the organization
  • Akto: API security platform that discovers APIs through traffic analysis and maintains an automated inventory

Common Scenarios

Scenario: Enterprise API Attack Surface Assessment

Context: A large enterprise has 200+ development teams using microservices. The security team suspects many undocumented APIs are exposed to the internet. A comprehensive API inventory is needed for a security audit.
Approach:
  1. DNS enumeration discovers 340 subdomains, 45 of which contain API-related keywords (api, rest, gateway, backend)
  2. Active probing of all subdomains with API path wordlist discovers 127 live API endpoints
  3. JavaScript analysis of the main web application reveals 34 API endpoints, 8 of which point to undocumented internal services
  4. AWS API Gateway inventory shows 67 REST APIs and 23 HTTP APIs across 12 accounts
  5. Cross-referencing against the official API catalog identifies 31 shadow APIs (undocumented) and 14 zombie APIs (deprecated versions)
  6. 3 zombie APIs have no authentication, exposing customer data through endpoints that were supposed to be decommissioned
  7. 2 shadow APIs expose internal admin functions to the internet without authorization
Pitfalls:
  • Only checking documented API endpoints and missing shadow APIs deployed outside the API gateway
  • Not scanning JavaScript bundles where frontend applications hardcode API endpoint URLs
  • Missing APIs behind non-standard ports or subpaths
  • Not checking for multiple API versions where older versions may lack security controls
  • Assuming all APIs go through the API gateway when some may be directly exposed
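To address the pitfall about non-standard ports, the probe list can be expanded before scanning. This is a minimal sketch: the port list is an illustrative assumption rather than a recommendation from this document, guessing the scheme from the port number is only a heuristic, and `candidate_urls` is a hypothetical helper.

```python
from itertools import product

# Illustrative ports only; tailor to the authorized scope.
ALT_PORTS = [443, 8080, 8443, 3000, 5000, 9000]
TLS_PORTS = {443, 8443}  # assumed to speak HTTPS


def candidate_urls(domain, ports, paths):
    """Build probe URLs for every port and path combination."""
    urls = []
    for port, path in product(ports, paths):
        scheme = "https" if port in TLS_PORTS else "http"
        default_port = 443 if scheme == "https" else 80
        host = domain if port == default_port else f"{domain}:{port}"
        urls.append(f"{scheme}://{host}{path}")
    return urls


for url in candidate_urls("api.example.com", ALT_PORTS, ["/api", "/graphql"]):
    print(url)
```

The number of requests grows with ports times paths times domains, so the list should stay within what the written authorization covers.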

Output Format

```
API Inventory and Discovery Report

Organization: Example Corp
Assessment Date: 2024-12-15
Domains Scanned: 340

Summary

| Category | Count |
|----------|-------|
| Total APIs Discovered | 127 |
| Documented APIs | 82 |
| Shadow APIs (undocumented) | 31 |
| Zombie APIs (deprecated) | 14 |
| APIs Without Authentication | 8 |
| APIs Exposing Sensitive Data | 5 |

Critical Findings

1. Zombie API: api-v1.example.com/api/v1/users - Deprecated in 2022, still accessible, no authentication required, returns full user data
2. Shadow API: internal-tools.example.com/api/admin - Admin functions exposed to internet without authorization
3. Exposed Documentation: 12 Swagger UI instances accessible publicly, revealing full API schema and endpoint details
```
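The summary table in the report can be generated from the counts gathered in the earlier steps rather than typed by hand. A minimal sketch, assuming the counts are already available in a dictionary; `render_summary` is a hypothetical helper and the numbers are the example values from the report above.

```python
def render_summary(counts):
    """Render category counts as a two-column pipe table."""
    lines = ["| Category | Count |", "|----------|-------|"]
    for category, count in counts.items():
        lines.append(f"| {category} | {count} |")
    return "\n".join(lines)


summary = render_summary({
    "Total APIs Discovered": 127,
    "Documented APIs": 82,
    "Shadow APIs (undocumented)": 31,
    "Zombie APIs (deprecated)": 14,
})
print(summary)
```

Dictionaries preserve insertion order, so the rows appear in the order the categories were added.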