content-access


Content access methodology

Ethical and legal approaches for accessing restricted web content for journalism and research.

Access hierarchy (most to least preferred)

┌─────────────────────────────────────────────────────────────────┐
│              CONTENT ACCESS DECISION HIERARCHY                   │
├─────────────────────────────────────────────────────────────────┤
│                                                                  │
│  1. FULLY LEGAL (Always try first)                              │
│     ├─ Library databases (PressReader, ProQuest, JSTOR)         │
│     ├─ Open access tools (Unpaywall, CORE, PubMed Central)     │
│     ├─ Author direct contact                                    │
│     └─ Interlibrary loan                                        │
│                                                                  │
│  2. LEGAL (Browser features)                                    │
│     ├─ Reader Mode (Safari, Firefox, Edge)                      │
│     ├─ Wayback Machine archives                                 │
│     └─ Google Scholar "All versions"                            │
│                                                                  │
│  3. GREY AREA (Use with caution)                               │
│     ├─ Archive.is for individual articles                       │
│     ├─ Disable JavaScript (breaks functionality)                │
│     └─ VPNs for geo-blocked content                            │
│                                                                  │
│  4. NOT RECOMMENDED                                             │
│     ├─ Credential sharing                                       │
│     ├─ Systematic scraping                                      │
│     └─ Commercial use of bypassed content                       │
│                                                                  │
└─────────────────────────────────────────────────────────────────┘
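
The tiers above can be encoded as an ordered plan so a script never silently escalates past the tier you are comfortable with. This is an illustrative sketch: the tier and method names are just labels taken from the diagram, not a real API.

```python
# Sketch: the decision hierarchy as an ordered, machine-checkable plan.
# Tier/method names are illustrative labels from the diagram above.

ACCESS_TIERS = [
    ('fully_legal', ['library_databases', 'open_access_tools',
                     'author_contact', 'interlibrary_loan']),
    ('legal', ['reader_mode', 'wayback_machine', 'scholar_all_versions']),
    ('grey_area', ['archive_today', 'disable_javascript', 'vpn_geo']),
]

def access_plan(stop_at: str = 'grey_area') -> list:
    """Return methods to try, in tier order, never going past `stop_at`."""
    plan = []
    for tier, methods in ACCESS_TIERS:
        plan.extend(methods)
        if tier == stop_at:
            break
    return plan
```

For example, `access_plan('legal')` yields the seven methods from tiers 1-2 and nothing from the grey area.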

Open access tools for academic papers

Unpaywall browser extension

Unpaywall finds free, legal copies of 20+ million academic papers.

```python
# Unpaywall API (free, requires an email address for identification)
import requests

def find_open_access(doi: str, email: str) -> dict:
    """Find an open access version of a paper using the Unpaywall API.

    Args:
        doi: Digital Object Identifier (e.g., "10.1038/nature12373")
        email: Your email for API identification

    Returns:
        Dict with the best open access URL if available.
    """
    url = f"https://api.unpaywall.org/v2/{doi}?email={email}"
    response = requests.get(url, timeout=30)

    if response.status_code != 200:
        return {'error': f'Status {response.status_code}'}

    data = response.json()

    if data.get('is_oa'):
        best_location = data.get('best_oa_location') or {}
        return {
            'is_open_access': True,
            'oa_url': best_location.get('url_for_pdf') or best_location.get('url'),
            'oa_status': data.get('oa_status'),  # gold, green, bronze, hybrid
            'host_type': best_location.get('host_type'),  # publisher, repository
            'version': best_location.get('version')  # publishedVersion, acceptedVersion
        }

    return {
        'is_open_access': False,
        'title': data.get('title'),
        'journal': data.get('journal_name')
    }
```

Usage

```python
result = find_open_access("10.1038/nature12373", "researcher@example.com")
if result.get('is_open_access'):
    print(f"Free PDF at: {result['oa_url']}")
```
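
Before calling the API it can help to sanity-check the DOI string. The pattern below is a loose shape check adapted from Crossref's published guidance, not a full validator (registered DOIs can be stranger than this):

```python
import re

# Loose DOI shape check: "10.", a 4-9 digit registrant code, a slash,
# then a non-empty suffix. A sanity check, not a validator.
DOI_RE = re.compile(r'^10\.\d{4,9}/\S+$')

def looks_like_doi(s: str) -> bool:
    """Return True if `s` has the general shape of a DOI."""
    return bool(DOI_RE.match(s.strip()))
```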

CORE API (295M papers)

```python
# CORE API - requires a free API key from https://core.ac.uk/
import requests

class CORESearch:
    def __init__(self, api_key: str):
        self.api_key = api_key
        self.base_url = "https://api.core.ac.uk/v3"

    def search(self, query: str, limit: int = 10) -> list:
        """Search the CORE database for open access papers."""
        headers = {'Authorization': f'Bearer {self.api_key}'}
        params = {
            'q': query,
            'limit': limit
        }

        response = requests.get(
            f"{self.base_url}/search/works",
            headers=headers,
            params=params,
            timeout=30
        )

        if response.status_code != 200:
            return []

        results = []
        for item in response.json().get('results', []):
            results.append({
                'title': item.get('title'),
                'authors': [a.get('name') for a in item.get('authors', [])],
                'year': item.get('yearPublished'),
                'doi': item.get('doi'),
                'download_url': item.get('downloadUrl'),
                'abstract': (item.get('abstract') or '')[:500]
            })

        return results

    def get_by_doi(self, doi: str) -> dict:
        """Get a paper by DOI."""
        headers = {'Authorization': f'Bearer {self.api_key}'}

        response = requests.get(
            f"{self.base_url}/works/{doi}",
            headers=headers,
            timeout=30
        )

        return response.json() if response.status_code == 200 else {}
```

Semantic Scholar API (214M papers)

```python
# Semantic Scholar API - free, no key required for basic use
import requests

def search_semantic_scholar(query: str, limit: int = 10) -> list:
    """Search Semantic Scholar for papers with open access links."""
    url = "https://api.semanticscholar.org/graph/v1/paper/search"
    params = {
        'query': query,
        'limit': limit,
        'fields': 'title,authors,year,abstract,openAccessPdf,citationCount'
    }

    response = requests.get(url, params=params, timeout=30)

    if response.status_code != 200:
        return []

    results = []
    for paper in response.json().get('data', []):
        oa_pdf = paper.get('openAccessPdf') or {}
        results.append({
            'title': paper.get('title'),
            'authors': [a.get('name') for a in paper.get('authors', [])],
            'year': paper.get('year'),
            'citations': paper.get('citationCount', 0),
            'open_access_url': oa_pdf.get('url'),
            'abstract': (paper.get('abstract') or '')[:500]
        })

    return results

def get_paper_by_doi(doi: str) -> dict:
    """Get paper details by DOI."""
    url = f"https://api.semanticscholar.org/graph/v1/paper/DOI:{doi}"
    params = {
        'fields': 'title,authors,year,abstract,openAccessPdf,references,citations'
    }
    response = requests.get(url, params=params, timeout=30)
    return response.json() if response.status_code == 200 else {}
```
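
Given a list shaped like the dicts `search_semantic_scholar` returns, a small helper can pick the most-cited hit that actually has a free PDF. The sample data below is invented for illustration:

```python
def best_open_access(results: list):
    """Return the most-cited result that has an open access URL, or None."""
    candidates = [r for r in results if r.get('open_access_url')]
    if not candidates:
        return None
    return max(candidates, key=lambda r: r.get('citations', 0))

# Invented sample in the shape search_semantic_scholar() returns
sample = [
    {'title': 'Paper A', 'citations': 120, 'open_access_url': None},
    {'title': 'Paper B', 'citations': 80,
     'open_access_url': 'https://example.org/b.pdf'},
    {'title': 'Paper C', 'citations': 200,
     'open_access_url': 'https://example.org/c.pdf'},
]
```

Here `best_open_access(sample)` picks Paper C: the most cited of the two results with a PDF link.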

Browser reader mode for soft paywalls

Activating reader mode

```javascript
// Bookmarklet that strips paywall overlays and re-enables scrolling.
// Works on some soft paywalls that load the full article before blocking it.

javascript:(function(){
    // Try to locate the article content
    var article = document.querySelector('article') ||
                  document.querySelector('[role="main"]') ||
                  document.querySelector('.article-body') ||
                  document.querySelector('.post-content');

    if (article) {
        // Remove elements that look like paywall/subscribe prompts
        document.querySelectorAll('[class*="paywall"], [class*="subscribe"], [id*="paywall"]')
            .forEach(el => el.remove());

        // Remove fixed-position overlays with a high stacking order
        document.querySelectorAll('*').forEach(el => {
            var style = getComputedStyle(el);
            if (style.position === 'fixed' && parseInt(style.zIndex, 10) > 100) {
                el.remove();
            }
        });

        // Re-enable scrolling
        document.body.style.overflow = 'auto';
        document.documentElement.style.overflow = 'auto';

        console.log('Overlay removed. Content may now be visible.');
    }
})();
```

Reader mode by browser

| Browser | How to activate | Effectiveness |
| --- | --- | --- |
| Safari | Click Reader icon in URL bar | High for soft paywalls |
| Firefox | Click Reader View icon (or F9) | High |
| Edge | Click Immersive Reader icon | Highest |
| Chrome | Requires flag: chrome://flags/#enable-reader-mode | Medium |

Library database access

Checking library access programmatically

```python
# Most library databases require authentication; this shows how to
# structure library API access rather than a working client.

class LibraryAccess:
    """Access pattern for library databases."""

    # Common library database endpoints
    DATABASES = {
        'pressreader': {
            'base': 'https://www.pressreader.com',
            'auth': 'library_card',
            'content': '7000+ newspapers/magazines'
        },
        'proquest': {
            'base': 'https://www.proquest.com',
            'auth': 'institutional',
            'content': 'news, dissertations, documents'
        },
        'jstor': {
            'base': 'https://www.jstor.org',
            'auth': 'institutional',
            'content': 'academic journals, books'
        },
        'nexis_uni': {
            'base': 'https://www.nexisuni.com',
            'auth': 'institutional',
            'content': 'legal, news, business'
        }
    }

    @staticmethod
    def get_pressreader_access_methods():
        """Ways to access PressReader through libraries."""
        return {
            'in_library': 'Connect to library WiFi, visit pressreader.com',
            'remote': 'Log in with library card credentials',
            'app': 'Download PressReader app, link library card',
            'note': 'Access typically 30-48 hours per session'
        }
```

Interlibrary Loan (ILL) workflow

```python
def request_via_ill(paper_info: dict, library_email: str) -> str:
    """Generate an interlibrary loan request.

    ILL is free through most libraries and can get almost any paper.
    Turnaround: typically 3-7 days.
    """
    request = f"""
INTERLIBRARY LOAN REQUEST

Title: {paper_info.get('title')}
Author(s): {paper_info.get('authors')}
Journal: {paper_info.get('journal')}
Year: {paper_info.get('year')}
DOI: {paper_info.get('doi')}
Volume/Issue: {paper_info.get('volume')}/{paper_info.get('issue')}
Pages: {paper_info.get('pages')}

Requested by: {library_email}
"""
    return request.strip()
```

VPN usage for geo-blocked content

When VPNs are appropriate

Legitimate VPN use cases for journalists/researchers

APPROPRIATE:

  • Accessing region-specific news sources
  • Researching how content appears in other countries
  • Bypassing government censorship (in some contexts)
  • Protecting source communications
  • Verifying geo-targeted content

INAPPROPRIATE:

  • Circumventing legitimate access controls
  • Accessing content you're contractually prohibited from viewing
  • Evading bans or blocks placed on your account

VPN service comparison

| Service | Best for | Privacy | Speed | Price |
| --- | --- | --- | --- | --- |
| ExpressVPN | Censorship bypass | Excellent | Fast | $$$ |
| NordVPN | General use | Excellent | Fast | $$ |
| Surfshark | Budget, unlimited devices | Good | Good | $ |
| ProtonVPN | Privacy-focused | Excellent | Medium | $$ |
| Tor Browser | Maximum anonymity | Excellent | Slow | Free |

Checking geo-restriction status

```python
import requests

def check_geo_access(url: str, regions: list = None) -> dict:
    """Check if URL is accessible from different regions.

    Note: This requires VPN/proxy services for actual testing.
    This function shows the concept.
    """
    regions = regions or ['US', 'UK', 'EU', 'JP', 'AU']
    results = {}

    # Direct access test
    try:
        response = requests.get(url, timeout=10)
        results['direct'] = {
            'accessible': response.status_code == 200,
            'status_code': response.status_code
        }
    except Exception as e:
        results['direct'] = {'accessible': False, 'error': str(e)}

    # Would need VPN/proxy integration for regional testing:
    # for region in regions:
    #     results[region] = test_through_proxy(url, region)

    return results
```

Archive-based access

Using Archive.today for paywalled articles

```python
import requests
from urllib.parse import quote

def get_archived_article(url: str):
    """Try to get an article from Archive.today.

    Archive.today often captures full article content
    because it renders JavaScript and captures the result.

    Legal status varies by jurisdiction - use for research purposes.
    Returns the archive URL, or None if no archive exists.
    """
    # Check for an existing archive
    search_url = f"https://archive.today/{quote(url, safe='')}"

    try:
        response = requests.get(search_url, timeout=30, allow_redirects=True)

        if response.status_code == 200 and 'archive.today' in response.url:
            return response.url

        # No existing archive - could request one.
        # Note: this may violate ToS, use responsibly.
        return None

    except Exception:
        return None
```

Wayback Machine for historical access

```python
import requests

def get_wayback_article(url: str):
    """Get an article from the Wayback Machine.

    100% legal - the Internet Archive is a recognized library.
    May have older versions of articles (from before a paywall was added).
    Returns the snapshot URL, or None if no snapshot exists.
    """
    # Check availability
    api_url = f"https://archive.org/wayback/available?url={url}"

    try:
        response = requests.get(api_url, timeout=10)
        data = response.json()

        snapshot = data.get('archived_snapshots', {}).get('closest', {})

        if snapshot.get('available'):
            return snapshot['url']

        return None
    except Exception:
        return None
```
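
The availability endpoint also accepts an optional `timestamp` (YYYYMMDD or longer) and returns the snapshot closest to that date. Below is a small offline helper that only builds the query URL (no request is made); `urlencode` handles the escaping:

```python
from urllib.parse import urlencode

def wayback_query_url(url: str, timestamp: str = None) -> str:
    """Build a Wayback availability query, optionally near a given date."""
    params = {'url': url}
    if timestamp:
        # YYYYMMDD (or longer); the API returns the closest snapshot to it
        params['timestamp'] = timestamp
    return "https://archive.org/wayback/available?" + urlencode(params)
```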

Google Scholar strategies

Finding free versions

```python
def find_free_via_scholar(title: str) -> list:
    """Search strategies for finding free paper versions.

    Google Scholar often links to:
    - Author's personal website copies
    - Institutional repository versions
    - ResearchGate/Academia.edu uploads
    """
    strategies = [
        {
            'method': 'scholar_all_versions',
            'description': 'Click "All X versions" under result',
            'success_rate': 'Medium-High'
        },
        {
            'method': 'scholar_pdf_link',
            'description': 'Look for [PDF] link on right side',
            'success_rate': 'Medium'
        },
        {
            'method': 'title_plus_pdf',
            'description': f'Search: "{title}" filetype:pdf',
            'success_rate': 'Medium'
        },
        {
            'method': 'author_site',
            'description': 'Find author\'s academic page',
            'success_rate': 'Medium'
        },
        {
            'method': 'preprint_servers',
            'description': 'Search arXiv, SSRN, bioRxiv',
            'success_rate': 'Field-dependent'
        }
    ]

    return strategies
```

Direct author contact

Email template for paper requests

```python
def generate_paper_request_email(paper: dict, requester: dict) -> str:
    """Generate a professional email requesting a paper from its author.

    Authors are typically happy to share their work.
    Success rate: Very high (70-90%).
    """
    template = f"""
Subject: Request for paper: {paper['title'][:50]}...

Dear Dr./Prof. {paper['author_last_name']},

I am a {requester['role']} at {requester['institution']}, researching
{requester['research_area']}.

I came across your paper "{paper['title']}" published in
{paper['journal']} ({paper['year']}), and I believe it would be
highly relevant to my work on {requester['specific_project']}.

Unfortunately, I don't have access through my institution. Would you
be willing to share a copy?

I would be happy to properly cite your work in any resulting publications.

Thank you for your time and for your contribution to the field.

Best regards,
{requester['name']}
{requester['title']}
{requester['institution']}
{requester['email']}
"""
    return template.strip()
```

Access strategy by content type

News articles

News article access strategies

  1. Library PressReader - 7,000+ publications worldwide
  2. Reader Mode - Works on ~50% of soft paywalls
  3. Archive.org - For older articles
  4. Archive.today - For recent articles (grey area)
  5. Google search - Sometimes cached versions appear

Tips:

  • Many newspapers offer free articles for .edu emails
  • Press releases often contain the same info as paywalled articles
  • Local library cards often include digital news access
  • Some publications have free tiers (5-10 articles/month)

Academic papers

Academic paper access strategies (in order)

  1. Unpaywall extension - Check first, automatic
  2. Google Scholar - Click "All versions", look for [PDF]
  3. Author's website - Check their academic page
  4. Institutional repository - Search university library
  5. Preprint servers - arXiv, SSRN, bioRxiv, medRxiv
  6. ResearchGate/Academia.edu - Author-uploaded copies
  7. CORE.ac.uk - 295M open access papers
  8. PubMed Central - For biomedical papers
  9. Contact author directly - High success rate
  10. Interlibrary Loan - Free, gets almost anything
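
Several steps in this list correspond to the APIs shown earlier on this page. The sketch below builds the per-DOI lookup URLs in roughly that order; the helper itself is illustrative, and the CORE endpoint additionally needs the Bearer-token header from the `CORESearch` example:

```python
def doi_lookup_urls(doi: str, email: str) -> list:
    """API lookup URLs for a DOI, ordered as in the strategy list above."""
    return [
        # Unpaywall (free, identify yourself by email)
        f"https://api.unpaywall.org/v2/{doi}?email={email}",
        # Semantic Scholar (free for basic use)
        f"https://api.semanticscholar.org/graph/v1/paper/DOI:{doi}",
        # CORE (also requires the Authorization: Bearer header)
        f"https://api.core.ac.uk/v3/works/{doi}",
    ]
```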

Books and reports

Book/report access strategies

  1. Library digital lending - Internet Archive, OverDrive
  2. Google Books - Often has preview or full text
  3. HathiTrust - Academic library consortium
  4. Project Gutenberg - Public domain books
  5. OpenLibrary - Internet Archive's book lending
  6. Publisher open access - Some chapters/reports free
  7. Author/organization website - Reports often available
  8. Interlibrary Loan - Physical books, scanned chapters

Legal and ethical framework

Fair use considerations (US)

Fair Use Factors (17 U.S.C. § 107)

  1. Purpose and character of use
    • Transformative use (commentary, criticism) favored
    • Non-commercial/educational use favored
    • Journalism generally protected
  2. Nature of copyrighted work
    • Factual works (news, research) - broader fair use
    • Creative works (fiction, art) - narrower fair use
  3. Amount used relative to the whole
    • Using only necessary portions favored
    • Taking the "heart" of the work disfavored
  4. Effect on the market
    • Use that substitutes for a purchase disfavored
    • Use with no market impact favored

Journalism privilege:

News reporting is explicitly listed as a fair use purpose. However, wholesale copying of entire articles is still problematic.

Best practices for researchers

Ethical content access guidelines

DO:

  • Use library resources first (supports the ecosystem)
  • Try open access tools before circumvention
  • Contact authors directly (they want citations)
  • Cite properly regardless of how you accessed content
  • Budget for subscriptions to frequently-used sources

DON'T:

  • Share login credentials
  • Systematically download entire databases
  • Use bypassed content for commercial purposes
  • Redistribute paywalled content
  • Rely solely on bypass methods