
News Aggregation (Multi-Source, 3-Day Window)


Collect latest news from multiple sites and aggregators, merge similar stories into short topics, and list all main source links under each topic.

When to use


  • You want one concise briefing from many outlets.
  • You need deduplicated coverage (same story from multiple sites).
  • You want source transparency (all original links shown).
  • You want a default time window of the last 3 days unless specified otherwise.
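The output contract above (one short topic, every contributing source link listed beneath it) can be sketched as a small formatter. This is a minimal sketch; the `source`/`link` field names are assumptions, not a fixed schema:

```python
def render_topic(summary: str, items: list[dict]) -> str:
    """One short topic line followed by every contributing source link."""
    lines = [f"- {summary}"]
    for item in items:
        lines.append(f"  - {item['source']}: {item['link']}")
    return "\n".join(lines)
```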

Required tools / APIs


  • No API keys required for basic RSS workflow.
  • Python 3.10+
Install:
```bash
pip install feedparser python-dateutil
```
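Feed items usually carry RFC 2822 `pubDate` strings, so the default 3-day window can be enforced with the standard library alone. A minimal sketch (unparseable dates are excluded rather than guessed):

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime

def within_window(pub_date: str, days: int = 3) -> bool:
    """Return True if an RFC 2822 pubDate falls inside the last `days` days."""
    try:
        ts = parsedate_to_datetime(pub_date)
    except (TypeError, ValueError):
        return False  # unparseable or missing dates are dropped, not guessed
    if ts.tzinfo is None:
        ts = ts.replace(tzinfo=timezone.utc)  # treat naive dates as UTC
    return ts >= datetime.now(timezone.utc) - timedelta(days=days)
```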

Sources (news sites + aggregators)


Use a mixed source list for better coverage.

News sites (RSS)


  • Reuters World:
    https://feeds.reuters.com/Reuters/worldNews
  • AP Top News:
    https://feeds.apnews.com/apnews/topnews
  • BBC World:
    http://feeds.bbci.co.uk/news/world/rss.xml
  • Al Jazeera:
    https://www.aljazeera.com/xml/rss/all.xml
  • The Guardian World:
    https://www.theguardian.com/world/rss
  • NPR News:
    https://feeds.npr.org/1001/rss.xml

Aggregators (RSS/API)


  • Google News (topic feed):
    https://news.google.com/rss/search?q=world
  • Bing News (RSS query):
    https://www.bing.com/news/search?q=world&format=RSS
  • Hacker News (tech):
    https://hnrss.org/frontpage
  • Reddit News (community signal):
    https://www.reddit.com/r/news/.rss
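Since the Google News feed above is a search endpoint, custom topic feeds only need a URL-encoded query. A minimal helper sketch:

```python
from urllib.parse import quote_plus

def google_news_feed(query: str) -> str:
    """Build a Google News RSS search URL for an arbitrary query."""
    return f"https://news.google.com/rss/search?q={quote_plus(query)}"
```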

Skills


Node.js quick fetch + grouping starter


```javascript
// npm install rss-parser
const Parser = require('rss-parser');
const parser = new Parser();

const SOURCES = {
  Reuters: 'https://feeds.reuters.com/Reuters/worldNews',
  AP: 'https://feeds.apnews.com/apnews/topnews',
  BBC: 'http://feeds.bbci.co.uk/news/world/rss.xml',
  'Google News': 'https://news.google.com/rss/search?q=world'
};

async function fetchRecent(days = 3) {
  const cutoff = Date.now() - days * 24 * 60 * 60 * 1000;
  const all = [];

  for (const [source, url] of Object.entries(SOURCES)) {
    let feed;
    try {
      feed = await parser.parseURL(url);
    } catch {
      continue; // skip unavailable feeds instead of aborting the whole run
    }
    for (const item of feed.items || []) {
      const ts = new Date(item.pubDate || item.isoDate || 0).getTime();
      if (!ts || ts < cutoff) continue;
      all.push({ source, title: item.title || '', link: item.link || '', ts });
    }
  }

  return all.sort((a, b) => b.ts - a.ts);
}

// Next step: add title-similarity clustering (same idea as the Python section above)
```
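The clustering step can be sketched in Python with token-set (Jaccard) similarity over titles; the default of 0.35 mirrors the threshold values used in Troubleshooting below. A greedy single-pass sketch, not a production clusterer:

```python
import re

def tokens(title: str) -> set[str]:
    """Lowercased alphanumeric tokens of a headline."""
    return set(re.findall(r"[a-z0-9]+", title.lower()))

def similarity(a: str, b: str) -> float:
    """Jaccard overlap of title tokens, in [0, 1]."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)

def group_titles(items: list[dict], threshold: float = 0.35) -> list[list[dict]]:
    """Greedy clustering: each item joins the first group whose lead title is similar enough."""
    groups: list[list[dict]] = []
    for item in items:
        for group in groups:
            if similarity(item["title"], group[0]["title"]) >= threshold:
                group.append(item)
                break
        else:
            groups.append([item])
    return groups
```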

Agent prompt


```text
Use the News Aggregation skill.

Requirements:
1) Pull news from multiple predefined sources (news sites + aggregators).
2) Default to only the last 3 days unless the user asks for another time range.
3) Group similar headlines into one short topic.
4) Under each topic, list all main source links (not just one source).
5) If 3+ sources cover the same event, output one topic with all those links.
6) Keep summaries short and factual; avoid adding unsupported claims.
```

Best practices


  • Keep source diversity (wire + publisher + aggregator) to reduce bias.
  • Rank grouped topics by number of independent sources.
  • Include publication timestamps when possible.
  • Keep the grouping threshold conservative to avoid merging unrelated stories.
  • Allow custom source lists and time windows when user requests.
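The "rank by number of independent sources" practice can be sketched in one function; the `source` field name is an assumption matching the fetch starter above:

```python
def rank_topics(groups: list[list[dict]]) -> list[list[dict]]:
    """Sort topic groups by count of distinct sources, most-covered first."""
    return sorted(groups, key=lambda g: len({item["source"] for item in g}), reverse=True)
```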

Troubleshooting


  • Empty results: some feeds may be unavailable; retry and rotate sources.
  • Over-grouping (unrelated stories merged into one topic): raise the similarity threshold (e.g., 0.35 -> 0.45).
  • Under-grouping (the same story split across topics): lower the threshold (e.g., 0.35 -> 0.28).
  • Rate limiting: fetch feeds sequentially with small delays.
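The availability and rate-limiting advice above can be combined into one polite sequential loop. A sketch; `fetch` is a stand-in for whatever feed parser you use, injected so dead feeds and retries stay testable:

```python
import time

def fetch_all(sources: dict, fetch, delay: float = 1.0, retries: int = 2) -> dict:
    """Fetch each feed sequentially, pausing between feeds and skipping dead ones."""
    results = {}
    for name, url in sources.items():
        for attempt in range(retries + 1):
            try:
                results[name] = fetch(url)
                break
            except Exception:
                if attempt == retries:
                    results[name] = []  # feed unavailable: move on, don't abort the run
        time.sleep(delay)  # small delay between feeds to avoid rate limiting
    return results
```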

See also


  • Web Search API (Free)
  • Web Scraping (Chrome + DuckDuckGo)