Compare commits

...

30 Commits

Author SHA1 Message Date
Leonora Tindall ac21e800b5 This is 0.5.3 2022-12-31 16:26:09 -06:00
Leonora Tindall 6ab07b1c06 Fix links on index page. 2022-12-31 16:21:08 -06:00
Leonora Tindall 888c9ec744 This is 0.5.2. 2022-12-05 09:48:55 -06:00
Leonora Tindall b99aa07d88 Remove HTTP cache. 2022-12-05 09:48:17 -06:00
Leonora Tindall c8f236b6c0 This is 0.5.1 2022-11-03 23:34:10 -05:00
Leonora Tindall 81890b7d7d Add better logging 2022-11-03 23:30:59 -05:00
Leonora Tindall 6e782a72f7 This is 0.5.0 2022-11-02 14:50:47 -05:00
Leonora Tindall 95f6887eda Support media envelopes and feed w/o transparent shares 2022-11-02 14:49:54 -05:00
Leonora Tindall 0f7af7c13a Better support for direct shares 2022-11-02 14:39:01 -05:00
Leonora Tindall c4c486dc8a Add support for Media Envelope for uploads 2022-11-02 14:37:45 -05:00
Leonora Tindall 8ed1790cd2 Nicer homepage theme 2022-11-02 14:35:11 -05:00
Leonora Tindall 63a61c1355 Add HTTP and data cacheing for speed 2022-11-02 14:34:50 -05:00
Leonora Tindall caa0723c1c Add bookmarklet 2022-11-01 18:23:47 -05:00
Leonora Tindall 7b55238f49 This is 0.4.0 2022-11-01 14:50:42 -05:00
Leonora Tindall 51a8e1ff88 Add post-slurping (get markdown source for post) 2022-11-01 14:50:14 -05:00
Leonora Tindall f1a0944688 Even better leniant parsing 2022-11-01 14:50:04 -05:00
Leonora Tindall fbcc5d536e This is 0.3.1 2022-11-01 14:15:52 -05:00
Leonora Tindall 55a2610eff More lenient parsing 2022-11-01 14:15:28 -05:00
Leonora Tindall d39e19beb2 This is 0.3.0 2022-11-01 14:11:31 -05:00
Leonora Tindall bf14f554ee Don't pagniate, just fetch everything 2022-11-01 14:11:19 -05:00
Leonora Tindall bd012f491a Refactor routes 2022-11-01 13:57:33 -05:00
Leonora Tindall 01fd48e9ca This is 0.2.2 2022-10-31 23:38:24 -05:00
Leonora Tindall 07b8dc30c7 Fix large CW indicators 2022-10-31 23:38:03 -05:00
Leonora Tindall 7fce1dab55 This is 0.2.1 2022-10-31 23:33:54 -05:00
Leonora Tindall 05717b06a7 Don't add spurious {} to CW posts 2022-10-31 23:33:24 -05:00
Leonora Tindall 530ffe6b03 Vendor modified RSS to fix atom:link issue. This is 0.2.0. 2022-10-31 22:59:25 -05:00
Leonora Tindall a737cb1008 Fix link rel from prev to previous 2022-10-31 22:57:54 -05:00
Leonora Tindall 6556a0b9b1 Update homepage 2022-10-31 22:06:10 -05:00
Leonora Tindall 26e00bae9f Use a proper user agent string 2022-10-31 21:53:32 -05:00
Leonora Tindall 17941d90be Links and correct sorting 2022-10-31 21:41:07 -05:00
14 changed files with 6226 additions and 267 deletions

1
.gitignore vendored
View File

@ -1,2 +1,3 @@
/target
mastodon-data.toml
http-cacache

3
.gitmodules vendored Normal file
View File

@ -0,0 +1,3 @@
[submodule "rss"]
path = rss
url = git@github.com:NoraCodes/rss.git

236
Cargo.lock generated
View File

@ -78,6 +78,12 @@ dependencies = [
"syn",
]
[[package]]
name = "async_once"
version = "0.2.6"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "2ce4f10ea3abcd6617873bae9f91d1c5332b4a778bd9ce34d0cd517474c1de82"
[[package]]
name = "atom_syndication"
version = "0.11.0"
@ -157,10 +163,47 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "ec8a7b6a70fde80372154c65702f00a0f56f3e1c36abbc6c440484be248856db"
[[package]]
name = "cc"
version = "1.0.73"
name = "cached"
version = "0.40.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "2fff2a6927b3bb87f9595d67196a70493f627687a71d87a0d692242c33f58c11"
checksum = "72b4147cd94d5fbdc2ab71b11d50a2f45493625576b3bb70257f59eedea69f3d"
dependencies = [
"async-trait",
"async_once",
"cached_proc_macro",
"cached_proc_macro_types",
"futures",
"hashbrown",
"instant",
"lazy_static",
"once_cell",
"thiserror",
"tokio",
]
[[package]]
name = "cached_proc_macro"
version = "0.15.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "751f7f4e7a091545e7f6c65bacc404eaee7e87bfb1f9ece234a1caa173dc16f2"
dependencies = [
"cached_proc_macro_types",
"darling 0.13.4",
"quote",
"syn",
]
[[package]]
name = "cached_proc_macro_types"
version = "0.1.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "3a4f925191b4367301851c6d99b09890311d74b0d43f274c0b34c86d308a3663"
[[package]]
name = "cc"
version = "1.0.74"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "581f5dba903aac52ea3feb5ec4810848460ee833876f1f9b0fdeab1f19091574"
[[package]]
name = "cfg-if"
@ -259,22 +302,6 @@ dependencies = [
"version_check",
]
[[package]]
name = "cookie_store"
version = "0.16.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "2e4b6aa369f41f5faa04bb80c9b1f4216ea81646ed6124d76ba5c49a7aafd9cd"
dependencies = [
"cookie",
"idna 0.2.3",
"log",
"publicsuffix",
"serde",
"serde_json",
"time 0.3.16",
"url",
]
[[package]]
name = "core-foundation"
version = "0.9.3"
@ -293,11 +320,13 @@ checksum = "5827cebf4670468b8772dd191856768aedcb1b0278a04f989f7766351917b9dc"
[[package]]
name = "corobel"
version = "0.1.0"
version = "0.5.3"
dependencies = [
"cached",
"chrono",
"clap",
"eggbug",
"mime",
"mime_guess",
"once_cell",
"pulldown-cmark",
"reqwest",
@ -387,8 +416,18 @@ version = "0.12.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "5f2c43f534ea4b0b049015d00269734195e6d3f0f6635cb692251aca6f9f8b3c"
dependencies = [
"darling_core",
"darling_macro",
"darling_core 0.12.4",
"darling_macro 0.12.4",
]
[[package]]
name = "darling"
version = "0.13.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "a01d95850c592940db9b8194bc39f4bc0e89dee5c4265e4b1807c34a9aba453c"
dependencies = [
"darling_core 0.13.4",
"darling_macro 0.13.4",
]
[[package]]
@ -405,13 +444,38 @@ dependencies = [
"syn",
]
[[package]]
name = "darling_core"
version = "0.13.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "859d65a907b6852c9361e3185c862aae7fafd2887876799fa55f5f99dc40d610"
dependencies = [
"fnv",
"ident_case",
"proc-macro2",
"quote",
"strsim",
"syn",
]
[[package]]
name = "darling_macro"
version = "0.12.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "29b5acf0dea37a7f66f7b25d2c5e93fd46f8f6968b1a5d7a3e02e97768afc95a"
dependencies = [
"darling_core",
"darling_core 0.12.4",
"quote",
"syn",
]
[[package]]
name = "darling_macro"
version = "0.13.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "9c972679f83bdf9c42bd905396b6c3588a843a17f0f16dfcfa3e2c5d57441835"
dependencies = [
"darling_core 0.13.4",
"quote",
"syn",
]
@ -431,7 +495,7 @@ version = "0.10.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "66e616858f6187ed828df7c64a6d71720d83767a7f19740b2d1b6fe6327b36e5"
dependencies = [
"darling",
"darling 0.12.4",
"proc-macro2",
"quote",
"syn",
@ -447,17 +511,6 @@ dependencies = [
"syn",
]
[[package]]
name = "derive_more"
version = "0.99.17"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "4fb810d30a7c1953f91334de7244731fc3f3c10d7fe163338a35b9f640960321"
dependencies = [
"proc-macro2",
"quote",
"syn",
]
[[package]]
name = "devise"
version = "0.3.1"
@ -511,29 +564,6 @@ dependencies = [
"chrono",
]
[[package]]
name = "eggbug"
version = "0.1.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "4004d4ec28b9e2c564617596bebe2fe586d2c2baef0b34311c358ccf875530d4"
dependencies = [
"base64",
"bytes",
"derive_more",
"futures",
"hmac",
"pbkdf2",
"reqwest",
"serde",
"serde_json",
"sha2",
"thiserror",
"tokio",
"tokio-util",
"tracing",
"uuid",
]
[[package]]
name = "either"
version = "1.8.0"
@ -820,9 +850,9 @@ checksum = "c4a1e36c821dbe04574f602848a19f742f4fb3c98d40449f11bcad18d6b17421"
[[package]]
name = "hyper"
version = "0.14.20"
version = "0.14.22"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "02c929dc5c39e335a03c405292728118860721b10190d98c2a0f0efd5baafbac"
checksum = "abfba89e19b959ca163c7752ba59d737c1ceea53a5d31a149c805446fc958064"
dependencies = [
"bytes",
"futures-channel",
@ -885,17 +915,6 @@ version = "1.0.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b9e0384b61958566e926dc50660321d12159025e767c18e043daf26b70104c39"
[[package]]
name = "idna"
version = "0.2.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "418a0a6fab821475f634efe3ccc45c013f742efe03d853e8d3355d5cb850ecf8"
dependencies = [
"matches",
"unicode-bidi",
"unicode-normalization",
]
[[package]]
name = "idna"
version = "0.3.0"
@ -1026,12 +1045,6 @@ dependencies = [
"regex-automata",
]
[[package]]
name = "matches"
version = "0.1.9"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "a3e378b66a060d48947b590737b30a1be76706c8dd7b8ba0f2fe3989c68a853f"
[[package]]
name = "memchr"
version = "2.5.0"
@ -1088,9 +1101,9 @@ dependencies = [
[[package]]
name = "native-tls"
version = "0.2.10"
version = "0.2.11"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "fd7e2f3618557f980e0b17e8856252eee3c97fa12c54dff0ca290fb6266ca4a9"
checksum = "07226173c32f2926027b63cce4bcd8076c3552846cbe7925f3aaffeac0a3b92e"
dependencies = [
"lazy_static",
"libc",
@ -1141,9 +1154,9 @@ dependencies = [
[[package]]
name = "num_cpus"
version = "1.13.1"
version = "1.14.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "19e64526ebdee182341572e50e9ad03965aa510cd94427a4549448f285e957a1"
checksum = "f6058e64324c71e02bc2b150e4f3bc8286db6c83092132ffa3f6b1eab0f9def5"
dependencies = [
"hermit-abi",
"libc",
@ -1250,15 +1263,6 @@ dependencies = [
"windows-sys 0.42.0",
]
[[package]]
name = "pbkdf2"
version = "0.11.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "83a0692ec44e4cf1ef28ca317f14f8f07da2d95ec3fa01f86e4467b725e60917"
dependencies = [
"digest",
]
[[package]]
name = "pear"
version = "0.2.3"
@ -1348,12 +1352,6 @@ dependencies = [
"version_check",
]
[[package]]
name = "proc-macro-hack"
version = "0.5.19"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "dbf0c48bc1d91375ae5c3cd81e3722dff1abcf81a30960240640d223f59fe0e5"
[[package]]
name = "proc-macro2"
version = "1.0.47"
@ -1376,22 +1374,6 @@ dependencies = [
"yansi",
]
[[package]]
name = "psl-types"
version = "2.0.11"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "33cb294fe86a74cbcf50d4445b37da762029549ebeea341421c7c70370f86cac"
[[package]]
name = "publicsuffix"
version = "2.2.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "96a8c1bda5ae1af7f99a2962e49df150414a43d62404644d98dd5c3a93d07457"
dependencies = [
"idna 0.3.0",
"psl-types",
]
[[package]]
name = "pulldown-cmark"
version = "0.9.2"
@ -1464,18 +1446,18 @@ dependencies = [
[[package]]
name = "ref-cast"
version = "1.0.12"
version = "1.0.13"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "12a733f1746c929b4913fe48f8697fcf9c55e3304ba251a79ffb41adfeaf49c2"
checksum = "53b15debb4f9d60d767cd8ca9ef7abb2452922f3214671ff052defc7f3502c44"
dependencies = [
"ref-cast-impl",
]
[[package]]
name = "ref-cast-impl"
version = "1.0.12"
version = "1.0.13"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "5887de4a01acafd221861463be6113e6e87275e79804e56779f4cdc131c60368"
checksum = "abfa8511e9e94fd3de6585a3d3cd00e01ed556dc9814829280af0e8dc72a8f36"
dependencies = [
"proc-macro2",
"quote",
@ -1523,8 +1505,6 @@ checksum = "431949c384f4e2ae07605ccaa56d1d9d2ecdb5cadd4f9577ccfab29f2e5149fc"
dependencies = [
"base64",
"bytes",
"cookie",
"cookie_store",
"encoding_rs",
"futures-core",
"futures-util",
@ -1537,18 +1517,15 @@ dependencies = [
"js-sys",
"log",
"mime",
"mime_guess",
"native-tls",
"once_cell",
"percent-encoding",
"pin-project-lite",
"proc-macro-hack",
"serde",
"serde_json",
"serde_urlencoded",
"tokio",
"tokio-native-tls",
"tokio-util",
"tower-service",
"url",
"wasm-bindgen",
@ -1676,9 +1653,9 @@ dependencies = [
[[package]]
name = "scoped-tls"
version = "1.0.0"
version = "1.0.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "ea6a9290e3c9cf0f18145ef7ffa62d68ee0bf5fcd651017e586dc7fd5da448c2"
checksum = "e1cf6437eb19a8f4a6cc0f7dca544973b0b78843adbfeb3683d1a94a0024a294"
[[package]]
name = "scopeguard"
@ -2199,19 +2176,10 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "0d68c799ae75762b8c3fe375feb6600ef5602c883c5d21eb51c09f22b83c4643"
dependencies = [
"form_urlencoded",
"idna 0.3.0",
"idna",
"percent-encoding",
]
[[package]]
name = "uuid"
version = "1.2.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "feb41e78f93363bb2df8b0e86a2ca30eed7806ea16ea0c790d757cf93f79be83"
dependencies = [
"serde",
]
[[package]]
name = "valuable"
version = "0.1.0"

View File

@ -1,19 +1,21 @@
[package]
name = "corobel"
version = "0.1.0"
version = "0.5.3"
edition = "2021"
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
[dependencies]
clap = { version = "4.0.18", features = [ "derive" ] }
eggbug = { version = "0.1.2", features = [ "tokio" ] }
reqwest = "0.11.12"
reqwest = { version = "0.11.12", features = [ "json" ] }
rocket = { version = "0.5.0-rc.2", features = [ "json" ] }
serde = { version = "1.0.147", features = [ "derive" ] }
tokio = { version = "1.21.2", features = [ "full" ] }
serde_json = "1.0.87"
once_cell = "1.16.0"
chrono = { version = "0.4.22", features = [ "serde" ] }
rss = { version = "2.0.1", features = [ "builders", "atom", "chrono" ] }
rss = { version = "2", features = [ "builders", "chrono" ] }
pulldown-cmark = "0.9.2"
cached = "0.40.0"
mime = "0.3.16"
mime_guess = "2.0.4"

View File

@ -16,9 +16,15 @@ ports to use for development and deployment.
- [ ] Handle redirects
- [x] RSS feeds for projects
- [x] Index page explaining what's going on
- [x] Better support for transparent shares
- [x] Add feed without shares
- [ ] More robust parsing (defaults for all!)
- [ ] RSS feeds for tags
- [ ] Atom Extension pagination support
- [x] Atom Extension pagination support
- [x] Disable pagination
- [x] HTTP Caching
- [x] Data caching
- [x] Nicer theme
- [ ] Read More support
- [ ] Dublin Core support
- [ ] Media Envelope support
- [x] Media Envelope support

1
rss Submodule

@ -0,0 +1 @@
Subproject commit 1b001f74ff947b70b6bfe48711a9f29f682dd988

View File

@ -0,0 +1 @@
{"nItems":0,"nPages":0,"items":[],"_links":[{"href":"/api/v1/project/vogon","rel":"project","type":"GET"},{"href":"/api/v1/project/vogon/posts?page=998","rel":"prev","type":"GET"}]}
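The sample above is an empty posts page (`"items":[]`), which is the condition the "fetch everything" loop in this change set uses to stop paginating. A minimal, std-only sketch of that loop follows. `Page` and `fetch_all_pages` are hypothetical names introduced for illustration, and the `fetch` closure is an assumed stand-in for the real HTTP request; this is not the code from the diff.

```rust
/// One page of results, reduced to the two fields the loop needs.
/// (Hypothetical type for illustration; post IDs stand in for full posts.)
#[derive(Debug, Clone, Default, PartialEq)]
pub struct Page {
    pub number_items: usize,
    pub items: Vec<u64>,
}

/// Fetch pages 0, 1, 2, ... until one comes back empty, merging them in order.
/// `fetch` is an assumed stand-in for the real network call.
pub fn fetch_all_pages<E>(
    mut fetch: impl FnMut(u64) -> Result<Page, E>,
) -> Result<Page, E> {
    // Start from an empty accumulator so that no page is merged twice.
    let mut merged = Page::default();
    let mut page_number = 0;
    loop {
        let mut page = fetch(page_number)?;
        if page.items.is_empty() {
            // An empty `items` array marks the end of the project's posts.
            return Ok(merged);
        }
        merged.number_items += page.number_items;
        merged.items.append(&mut page.items);
        page_number += 1;
    }
}
```

Passing the fetcher as a closure keeps the termination logic testable without network access; any error from `fetch` ends the loop early and is returned unchanged.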

File diff suppressed because it is too large

File diff suppressed because one or more lines are too long

View File

@ -3,7 +3,7 @@ use serde::Deserialize;
/// The API URL from whence Cohost serves JSON project definitions
pub const COHOST_ACCOUNT_API_URL: &str = "https://cohost.org/api/v1/project/";
#[derive(Debug, Deserialize, PartialEq, Eq)]
#[derive(Debug, Clone, Deserialize, PartialEq, Eq)]
pub struct CohostAccount {
#[serde(rename = "projectId")]
pub project_id: u64,

View File

@ -1,5 +1,5 @@
use chrono::{DateTime, Utc};
use serde::Deserialize;
use serde::{Deserialize, Deserializer};
pub fn cohost_posts_api_url(project: impl AsRef<str>, page: u64) -> String {
format!(
@ -11,7 +11,7 @@ pub fn cohost_posts_api_url(project: impl AsRef<str>, page: u64) -> String {
// Cohost doesn't give us Next links ("rel: next") for further pages, so we'll have to ALWAYS populate the rel=next field
#[derive(Deserialize)]
#[derive(Debug, Clone, Deserialize)]
pub struct CohostPostsPage {
#[serde(rename = "nItems")]
pub number_items: usize,
@ -22,45 +22,96 @@ pub struct CohostPostsPage {
pub links: Vec<CohostPostLink>,
}
#[derive(Deserialize)]
#[derive(Debug, Clone, Deserialize)]
pub struct CohostPost {
#[serde(rename = "postId")]
pub id: u64,
#[serde(deserialize_with = "deserialize_null_default", default)]
pub headline: String,
#[serde(rename = "publishedAt")]
pub published_at: DateTime<Utc>,
pub cws: Vec<String>,
pub tags: Vec<String>,
#[serde(rename = "plainTextBody")]
#[serde(
rename = "plainTextBody",
deserialize_with = "deserialize_null_default",
default
)]
pub plain_body: String,
#[serde(rename = "singlePostPageUrl")]
#[serde(
rename = "singlePostPageUrl",
deserialize_with = "deserialize_null_default",
default
)]
pub url: String,
#[serde(deserialize_with = "deserialize_null_default", default)]
pub blocks: Vec<CohostPostBlock>,
#[serde(rename = "transparentShareOfPostId")]
pub transparent_share_of_post_id: Option<u64>,
#[serde(rename = "postingProject")]
pub poster: CohostPostingProject,
#[serde(rename = "shareTree")]
pub share_tree: Vec<CohostPost>,
}
#[derive(Deserialize)]
#[derive(Debug, Clone, Deserialize)]
pub struct CohostPostingProject {
#[serde(rename = "projectId")]
pub id: u64,
#[serde(deserialize_with = "deserialize_null_default", default)]
pub handle: String,
#[serde(rename = "displayName")]
#[serde(
rename = "displayName",
deserialize_with = "deserialize_null_default",
default
)]
pub display_name: String,
#[serde(deserialize_with = "deserialize_null_default", default)]
pub dek: String,
#[serde(deserialize_with = "deserialize_null_default", default)]
pub description: String,
#[serde(deserialize_with = "deserialize_null_default", default)]
pub pronouns: String,
}
#[derive(Deserialize)]
#[derive(Debug, Clone, Deserialize)]
pub struct CohostPostLink {
#[serde(deserialize_with = "deserialize_null_default", default)]
pub href: String,
#[serde(deserialize_with = "deserialize_null_default", default)]
pub rel: String,
#[serde(rename = "type")]
#[serde(
rename = "type",
deserialize_with = "deserialize_null_default",
default
)]
pub t_type: String,
}
#[derive(Debug, Clone, Deserialize)]
pub struct CohostPostBlock {
pub attachment: Option<CohostPostAttachment>,
}
#[derive(Debug, Clone, Deserialize)]
pub struct CohostPostAttachment {
#[serde(
rename = "fileURL",
deserialize_with = "deserialize_null_default",
default
)]
pub file_url: String,
}
fn deserialize_null_default<'de, D, T>(deserializer: D) -> Result<T, D::Error>
where
T: Default + Deserialize<'de>,
D: Deserializer<'de>,
{
let opt = Option::deserialize(deserializer)?;
Ok(opt.unwrap_or_default())
}
#[test]
fn test_deserialize() -> Result<(), Box<dyn std::error::Error>> {
let post_page_json = include_str!("../samples/cohost/api/v1/project_posts.json");
@ -71,3 +122,19 @@ fn test_deserialize() -> Result<(), Box<dyn std::error::Error>> {
assert_eq!(post.poster.id, 32693);
Ok(())
}
#[test]
fn test_deserialize_weird() -> Result<(), Box<dyn std::error::Error>> {
let post_page_json = include_str!("../samples/cohost/api/v1/vogon_pathological.json");
let _post_page_actual: CohostPostsPage = serde_json::from_str(post_page_json)?;
Ok(())
}
#[test]
fn test_deserialize_empty() -> Result<(), Box<dyn std::error::Error>> {
let post_page_json = include_str!("../samples/cohost/api/v1/empty_posts_age.json");
let post_page_actual: CohostPostsPage = serde_json::from_str(post_page_json)?;
println!("{:?}", post_page_actual);
assert!(post_page_actual.items.is_empty());
Ok(())
}
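The lenient parsing in this file pairs two serde attributes: `default` covers a key that is absent, and `deserialize_with = "deserialize_null_default"` covers a key that is present but `null`. The std-only sketch below models those three input states without serde to show the intended outcome. `Field` and `value_or_default` are hypothetical names used only for illustration.

```rust
/// The three states a JSON field can be in when it is decoded.
/// (Hypothetical type; serde itself does not expose this enum.)
#[derive(Debug, Clone, PartialEq)]
pub enum Field<T> {
    /// The key was not in the object at all.
    Missing,
    /// The key was present with an explicit `null`.
    Null,
    /// The key was present with a real value.
    Present(T),
}

/// Collapse both "missing" and "null" into `T::default()`,
/// which is the effect the two attributes have together.
pub fn value_or_default<T: Default>(field: Field<T>) -> T {
    match field {
        Field::Present(value) => value,
        Field::Missing | Field::Null => T::default(),
    }
}
```

For a `String` field this yields an empty string, and for a `Vec` field an empty vector, so downstream code never has to handle an `Option`.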

View File

@ -3,7 +3,8 @@ use std::collections::HashMap;
use std::error::Error;
#[macro_use]
extern crate rocket;
use reqwest::StatusCode;
use cached::proc_macro::cached;
use reqwest::{Client, StatusCode};
use rocket::response::content::RawHtml;
use rocket::serde::json::Json;
@ -12,7 +13,7 @@ mod cohost_posts;
mod syndication;
mod webfinger;
use cohost_account::{CohostAccount, COHOST_ACCOUNT_API_URL};
use cohost_posts::{cohost_posts_api_url, CohostPostsPage};
use cohost_posts::{cohost_posts_api_url, CohostPost, CohostPostsPage};
use webfinger::CohostWebfingerResource;
#[derive(Parser, Debug)]
@ -30,7 +31,22 @@ fn default_base_url() -> String {
"/".into()
}
fn user_agent() -> String {
format!(
"{}/{} (RSS feed converter) on {}",
env!("CARGO_PKG_NAME"),
env!("CARGO_PKG_VERSION"),
&ARGS.domain
)
}
static ARGS: once_cell::sync::Lazy<Args> = once_cell::sync::Lazy::new(|| Args::parse());
static CLIENT: once_cell::sync::Lazy<Client> = once_cell::sync::Lazy::new(|| {
reqwest::Client::builder()
.user_agent(user_agent())
.build()
.unwrap()
});
#[get("/")]
fn index() -> RawHtml<&'static str> {
@ -38,12 +54,18 @@ fn index() -> RawHtml<&'static str> {
}
#[derive(Responder)]
#[response(content_type = "text/markdown")]
struct MdResponse {
inner: String,
}
#[derive(Debug, Clone, Responder)]
#[response(content_type = "application/rss+xml")]
struct RssResponse {
inner: String,
}
#[derive(Responder)]
#[derive(Debug, Responder)]
#[response(content_type = "text/plain")]
enum ErrorResponse {
#[response(status = 404)]
@ -52,132 +74,178 @@ enum ErrorResponse {
InternalError(String),
}
#[get("/<project>/feed.rss?<page>")]
async fn syndication_rss_route(
project: &str,
page: Option<u64>,
) -> Result<RssResponse, ErrorResponse> {
let page = page.unwrap_or(0);
let project_url = format!("{}{}", COHOST_ACCOUNT_API_URL, project);
let posts_url = cohost_posts_api_url(project, page);
eprintln!("making request to {}", project_url);
let project_data: CohostAccount = match reqwest::get(project_url).await {
Ok(v) => match v.status() {
StatusCode::OK => match v.json::<CohostAccount>().await {
Ok(a) => a,
Err(e) => {
let err = format!("Couldn't deserialize Cohost project '{}': {:?}", project, e);
eprintln!("{}", err);
return Err(ErrorResponse::InternalError(err));
}
},
// TODO NORA: Handle possible redirects
s => {
let err = format!(
"Didn't receive status code 200 for Cohost project '{}'; got {:?} instead.",
project, s
);
eprintln!("{}", err);
return Err(ErrorResponse::NotFound(err));
#[cached(time = 60, result)]
async fn get_post_from_page(project_id: String, post_id: u64) -> Result<CohostPost, ErrorResponse> {
let mut page = 0;
loop {
let new_page = get_page_data(project_id.clone(), page).await?;
if new_page.items.is_empty() {
// Once there are no posts, we're done.
return Err(ErrorResponse::NotFound(
"End of posts reached, ID not found.".into(),
));
} else {
page += 1;
if let Some(post) = new_page.items.into_iter().find(|post| post.id == post_id) {
return Ok(post);
}
},
Err(e) => {
let err = format!(
"Error making request to Cohost for project '{}': {:?}",
project, e
);
eprintln!("{}", err);
return Err(ErrorResponse::InternalError(err));
}
};
}
}
eprintln!("making request to {}", posts_url);
match reqwest::get(posts_url).await {
#[cached(time = 120, result)]
async fn get_full_post_data(project_id: String) -> Result<CohostPostsPage, ErrorResponse> {
let mut merged_page = get_page_data(project_id.clone(), 0).await?;
// Page 0 is already in merged_page, so the loop continues from page 1.
let mut page = 1;
loop {
let mut new_page = get_page_data(project_id.clone(), page).await?;
if new_page.items.is_empty() {
// Once there are no posts, we're done.
break;
} else {
page += 1;
merged_page.number_items += new_page.number_items;
merged_page.items.append(&mut new_page.items);
}
}
Ok(merged_page)
}
// Not cached because it's never used individually.
async fn get_page_data(project_id: String, page: u64) -> Result<CohostPostsPage, ErrorResponse> {
let posts_url = cohost_posts_api_url(&project_id, page);
eprintln!("[INT] making request to {}", posts_url);
match CLIENT.get(posts_url).send().await {
Ok(v) => match v.status() {
StatusCode::OK => match v.json::<CohostPostsPage>().await {
Ok(page_data) => {
return Ok(RssResponse {
inner: syndication::channel_for_posts_page(
project,
page,
project_data,
page_data,
)
.to_string(),
});
}
Ok(page_data) => Ok(page_data),
Err(e) => {
let err = format!(
"Couldn't deserialize Cohost posts page for '{}': {:?}",
project, e
project_id, e
);
eprintln!("{}", err);
eprintln!("[ERR] {}", err);
return Err(ErrorResponse::InternalError(err));
}
},
// TODO NORA: Handle possible redirects
s => {
let err = format!("Didn't receive status code 200 for posts for Cohost project '{}'; got {:?} instead.", project_id, s);
eprintln!("{}", err);
eprintln!("[ERR] {}", err);
return Err(ErrorResponse::NotFound(err));
}
},
Err(e) => {
let err = format!(
"Error making request to Cohost for posts for project '{}': {:?}",
project, e
project_id, e
);
eprintln!("{}", err);
eprintln!("[ERR] {}", err);
return Err(ErrorResponse::InternalError(err));
}
};
}
}
#[cached(time = 60, result)]
async fn get_project_data(project_id: String) -> Result<CohostAccount, ErrorResponse> {
let project_url = format!("{}{}", COHOST_ACCOUNT_API_URL, project_id);
eprintln!("[INT] making request to {}", project_url);
match CLIENT.get(project_url).send().await {
Ok(v) => match v.status() {
StatusCode::OK => match v.json::<CohostAccount>().await {
Ok(a) => Ok(a),
Err(e) => {
let err = format!(
"Couldn't deserialize Cohost project '{}': {:?}",
project_id, e
);
eprintln!("[ERR] {}", err);
Err(ErrorResponse::InternalError(err))
}
},
// TODO NORA: Handle possible redirects
s => {
let err = format!(
"Didn't receive status code 200 for Cohost project '{}'; got {:?} instead.",
project_id, s
);
eprintln!("[ERR] {}", err);
Err(ErrorResponse::NotFound(err))
}
},
Err(e) => {
let err = format!(
"Error making request to Cohost for project '{}': {:?}",
project_id, e
);
eprintln!("[ERR] {}", err);
Err(ErrorResponse::InternalError(err))
}
}
}
#[get("/<project>/originals.rss")]
async fn syndication_originals_rss_route(project: String) -> Result<RssResponse, ErrorResponse> {
eprintln!("[EXT] Request to /{}/originals.rss", project);
let project_data = get_project_data(project.clone()).await?;
let page_data = get_full_post_data(project.clone()).await?;
Ok(RssResponse {
inner: syndication::channel_for_posts_page(project.clone(), project_data, page_data, true)
.to_string(),
})
}
#[get("/<project>/feed.rss")]
async fn syndication_rss_route(project: String) -> Result<RssResponse, ErrorResponse> {
eprintln!("[EXT] Request to /{}/feed.rss", project);
let project_data = get_project_data(project.clone()).await?;
let page_data = get_full_post_data(project.clone()).await?;
Ok(RssResponse {
inner: syndication::channel_for_posts_page(project.clone(), project_data, page_data, false)
.to_string(),
})
}
#[get("/<project>/<id>")]
async fn post_md_route(project: String, id: u64) -> Result<MdResponse, ErrorResponse> {
eprintln!("[EXT] Request to /{}/{}", project, id);
let _project_data = get_project_data(project.clone()).await?;
let post_data = get_post_from_page(project.clone(), id).await?;
Ok(MdResponse {
inner: post_data.plain_body,
})
}
#[get("/.well-known/webfinger?<params..>")]
async fn webfinger_route(params: HashMap<String, String>) -> Option<Json<CohostWebfingerResource>> {
async fn webfinger_route(
params: HashMap<String, String>,
) -> Result<Json<CohostWebfingerResource>, ErrorResponse> {
let mut url_params_string = String::new();
for (k, v) in params.iter() {
url_params_string.push_str(&format!("{}={}&", k, v));
}
eprintln!(
"[EXT] Request to /.well-known/webfinger?{}",
url_params_string
);
if params.len() != 1 {
eprintln!(
let err = format!(
"Too many or too few parameters. Expected 1, got {}",
params.len()
);
return None;
eprintln!("[ERR] {}", err);
return Err(ErrorResponse::InternalError(err));
}
if let Some(param) = params.iter().next() {
let url = format!("{}{}", COHOST_ACCOUNT_API_URL, param.0);
eprintln!("making request to {}", url);
match reqwest::get(url).await {
Ok(v) => {
match v.status() {
StatusCode::OK => match v.json::<CohostAccount>().await {
Ok(_v) => {
return Some(Json(CohostWebfingerResource::new(
param.0.as_str(),
&ARGS.domain,
&ARGS.base_url,
)));
}
Err(e) => {
eprintln!("Couldn't deserialize Cohost project '{}': {:?}", param.0, e);
}
},
// TODO NORA: Handle possible redirects
s => {
eprintln!("Didn't receive status code 200 for Cohost project '{}'; got {:?} instead.", param.0, s);
return None;
}
}
}
Err(e) => {
eprintln!(
"Error making request to Cohost for project '{}': {:?}",
param.0, e
);
return None;
}
};
let _project_data = get_project_data(param.0.clone()).await?;
Ok(Json(CohostWebfingerResource::new(
param.0.as_str(),
&ARGS.domain,
&ARGS.base_url,
)))
} else {
Err(ErrorResponse::NotFound("No project ID provided.".into()))
}
None
}
#[rocket::main]
@ -187,7 +255,13 @@ async fn main() -> Result<(), Box<dyn Error>> {
let _rocket = rocket::build()
.mount(
&ARGS.base_url,
routes![index, webfinger_route, syndication_rss_route],
routes![
index,
webfinger_route,
syndication_rss_route,
syndication_originals_rss_route,
post_md_route
],
)
.ignite()
.await?
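The `#[cached(time = N, result)]` attributes in this file give each helper a time-bounded cache. As far as the `cached` crate documents it, the `result` flag means only `Ok` values are stored. The sketch below illustrates that behaviour using only the standard library; `TtlCache` and `get_or_try_insert` are hypothetical names, and this is not how the `cached` macro is implemented.

```rust
use std::collections::HashMap;
use std::hash::Hash;
use std::time::{Duration, Instant};

/// A tiny time-to-live cache (hypothetical type for illustration).
pub struct TtlCache<K, V> {
    ttl: Duration,
    entries: HashMap<K, (Instant, V)>,
}

impl<K: Eq + Hash, V: Clone> TtlCache<K, V> {
    pub fn new(ttl: Duration) -> Self {
        Self { ttl, entries: HashMap::new() }
    }

    /// Return a still-fresh cached value, or call `compute` and store
    /// its result only when it is `Ok`. Errors are never cached, so a
    /// failed lookup is retried on the next call.
    pub fn get_or_try_insert<E>(
        &mut self,
        key: K,
        compute: impl FnOnce() -> Result<V, E>,
    ) -> Result<V, E> {
        if let Some((stored_at, value)) = self.entries.get(&key) {
            if stored_at.elapsed() < self.ttl {
                return Ok(value.clone());
            }
        }
        let value = compute()?;
        self.entries.insert(key, (Instant::now(), value.clone()));
        Ok(value)
    }
}
```

Not caching errors matters here: a transient 404 or network failure from the upstream API should not be served back for the full cache lifetime.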

View File

@ -5,27 +5,28 @@ use rss::Channel;
pub fn channel_for_posts_page(
project_name: impl AsRef<str>,
page_number: u64,
project: CohostAccount,
page: CohostPostsPage,
mut page: CohostPostsPage,
originals_only: bool,
) -> Channel {
let project_name = project_name.as_ref().clone();
let mut builder = rss::ChannelBuilder::default();
builder
.title(format!("{} Cohost Posts", project.display_name))
.title(format!(
"{} Cohost Posts{}",
project.display_name,
if originals_only { "" } else { " and Shares" }
))
.description(project.description)
.generator(Some(format!(
"{} {}",
env!("CARGO_CRATE_NAME"),
env!("CARGO_PKG_VERSION")
)))
.link(format!(
"https://cohost.org/{}?page={}",
project_name.as_ref(),
page_number
));
.link(format!("https://cohost.org/{}", project_name,));
page.items.sort_by_key(|item| item.published_at);
let mut items = Vec::with_capacity(page.number_items);
for item in page.items {
let mut item_builder = rss::ItemBuilder::default();
let mut categories: Vec<rss::Category> = Vec::with_capacity(item.tags.len());
@ -47,16 +48,24 @@ pub fn channel_for_posts_page(
.pub_date(item.published_at.to_rfc2822())
.source(Some(rss::Source {
title: Some(format!("{} Cohost Posts", project.display_name)),
url: format!("https://{}/feed/{}.rss", ARGS.domain, project_name.as_ref()),
url: format!("https://{}/feed/{}.rss", ARGS.domain, project_name),
}));
let mut body_text = String::new();
if item.share_tree.len() == 1 {
body_text.push_str("(in reply to another post)\n---\n")
if let Some(shared_post_id) = item.transparent_share_of_post_id {
if originals_only {
continue;
}
body_text.push_str(&format!(
"(share of post {} without any commentary)\n\n---\n\n",
shared_post_id
));
} else if item.share_tree.len() == 1 {
body_text.push_str("(in reply to another post)\n\n---\n\n")
} else if item.share_tree.len() > 1 {
body_text.push_str(&format!(
"(in reply to {} other posts)\n---\n",
"(in reply to {} other posts)\n\n---\n\n",
item.share_tree.len()
));
}
@ -64,12 +73,12 @@ pub fn channel_for_posts_page(
if item.cws.is_empty() {
body_text.push_str(&item.plain_body);
} else {
body_text.push_str("Sensitive post, body text omitted. Content warnings:{}");
body_text.push_str("Sensitive post, body text omitted. Content warnings:");
for cw in item.cws {
body_text.push_str(&format!(" {},", cw));
}
body_text.pop(); // Remove trailing comma
body_text.push_str("\n---\n")
body_text.push_str("\n\n---\n\n")
};
if !item.tags.is_empty() {
@ -88,9 +97,22 @@ pub fn channel_for_posts_page(
let parser = pulldown_cmark::Parser::new_ext(&body_text, options);
let mut html_output = String::new();
pulldown_cmark::html::push_html(&mut html_output, parser);
item_builder.content(html_output);
for attachment in item.blocks.into_iter().filter_map(|block| block.attachment) {
use mime_guess::from_path as guess_mime_from_path;
use rss::EnclosureBuilder;
let enclosure = EnclosureBuilder::default()
.mime_type(
guess_mime_from_path(&attachment.file_url)
.first_or_octet_stream()
.to_string(),
)
.url(attachment.file_url)
.build();
item_builder.enclosure(enclosure);
}
items.push(item_builder.build());
}
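The share handling in this hunk distinguishes three cases: a transparent share (skipped entirely in the originals-only feed), a reply to one or more posts, and a plain original post. The sketch below isolates that decision as a pure function. `PostSummary` and `share_prefix` are hypothetical names introduced for illustration; the prefix strings are copied from the diff.

```rust
/// The two fields of a post that drive the share logic.
/// (Hypothetical, reduced type for illustration.)
pub struct PostSummary {
    pub transparent_share_of_post_id: Option<u64>,
    pub share_tree_len: usize,
}

/// Returns `None` when the post should be left out of the feed,
/// otherwise the Markdown prefix to put before the post body.
pub fn share_prefix(post: &PostSummary, originals_only: bool) -> Option<String> {
    if let Some(shared_post_id) = post.transparent_share_of_post_id {
        if originals_only {
            // Shares without commentary do not belong in the originals feed.
            return None;
        }
        return Some(format!(
            "(share of post {} without any commentary)\n\n---\n\n",
            shared_post_id
        ));
    }
    Some(match post.share_tree_len {
        0 => String::new(),
        1 => "(in reply to another post)\n\n---\n\n".to_string(),
        n => format!("(in reply to {} other posts)\n\n---\n\n", n),
    })
}
```

Checking the transparent-share case first matters because such a post can also have a non-empty share tree; the blank lines around `---` keep it a Markdown horizontal rule instead of turning the preceding line into a heading.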

View File

@ -20,21 +20,68 @@
line-height: 1.75;
font-size: 1.25em;
}
h1,h2,h3,h4,h5,h6 {
font-family: sans-serif;
}
h1 {
text-align: center;
}
code {
font-family: monospace;
background-color: black;
color: white;
display: inline-block;
padding: 0px 4px;
border-radius: 4px;
}
a code {
color: white;
background-color: darkblue;
}
a:hover code {
color: darkblue;
background-color: white;
}
</style>
</head>
<body>
<h1>corobel</h1>
<h2>RSS feeds from Cohost pages</h2>
<h2>Standard Data from Cohost Posts and Projects</h2>
<p>
Go to <code>/project_name/feed.rss</code> to get a feed for a project.
For example, <a href="/noracodes/feed.rss"><code>/noracodes/feed.rss</code></a> will give you the feed for my page.
<h3>Project RSS Feeds</h3>
Go to <code>/project_name/feed.rss</code> to get a feed for a project, or <code>/project_name/originals.rss</code> for just original posts (including shared posts with commentary).
For example, <a href="/noracodes/feed.rss"><code>/noracodes/feed.rss</code></a> will give you the feed for my page,
or <a href="/noracodes/originals.rss"><code>/noracodes/originals.rss</code></a> for just my original posts.
</p>
<p>
<h3>Markdown Extraction</h3>
You can also get a particular post's original plain-text body at <code>/project_name/post_id/</code>, such as
<a href="/noracodes/169186/"><code>/noracodes/169186/</code></a>. (In a Cohost post URL, the ID is the numerical part after <code>/post/</code>.
For instance, in <code>https://cohost.org/noracodes/post/169186-october-update</code>, the ID is "169186".)
Or, drag this bookmarklet: <a href="javascript:(function(){const regex = /^https:\/\/cohost.org\/([a-zA-Z_\-0-9]*)\/post\/([0-9]*)-.*/;const new_loc = window.location.href.replace(regex, 'https://corobel.nora.codes/$1/$2');window.open(new_loc);})()">
Cohost: Extract Source
</a> to your bookmarks bar and then click on it when you're on a Cohost individual post page to download that post's source.
</p>
<p>
<h3>Webfinger Resources</h3>
Webfinger resources for accounts are provided at the Webfinger well-known URL <code>/.well-known/webfinger?project_name</code>.
</p>
<p>
Brought to you by Leonora Tindall, written in Rust with Rocket.
<h3>Technical Details</h3>
Since 0.5.0, Corobel caches various responses to provide better service.
<ul>
<li>Project/account data for <b>60 seconds</b></li>
<li>Individual posts for <b>60 seconds</b></li>
<li>Whole RSS feeds for <b>120 seconds</b></li>
</ul>
This means that if you update a post and then immediately request its source, you might get the old source. Just wait a few seconds.
</p>
<p>
Brought to you by <a href="https://nora.codes">Leonora Tindall</a>, written in Rust with Rocket. Code is <a href="https://git.nora.codes/nora/corobel">online</a>; bug reports should go to my email, nora@nora.codes.
</p>
</body>
</html>
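The bookmarklet on the index page rewrites a Cohost post URL into the `/project_name/post_id` path served by the Markdown route. A rough Rust equivalent of that mapping, using only the standard library, might look like the sketch below. `parse_cohost_post_url` and `source_path` are hypothetical helpers, not part of the code in this diff, and they are slightly stricter than the bookmarklet's regex in that they reject an empty handle or ID.

```rust
/// Split a Cohost post URL into its project handle and numeric post ID.
/// Returns `None` for anything that is not a single-post URL.
pub fn parse_cohost_post_url(url: &str) -> Option<(String, u64)> {
    let rest = url.strip_prefix("https://cohost.org/")?;
    let (project, rest) = rest.split_once("/post/")?;
    // Same character class as the bookmarklet: letters, digits, '_' and '-'.
    let handle_ok = !project.is_empty()
        && project
            .chars()
            .all(|c| c.is_ascii_alphanumeric() || c == '_' || c == '-');
    if !handle_ok {
        return None;
    }
    // The ID is the run of digits before the first '-' of the slug.
    let (id, _slug) = rest.split_once('-')?;
    if id.is_empty() || !id.chars().all(|c| c.is_ascii_digit()) {
        return None;
    }
    let post_id = id.parse::<u64>().ok()?;
    Some((project.to_string(), post_id))
}

/// Build the path of the Markdown extraction route for a post URL.
pub fn source_path(url: &str) -> Option<String> {
    let (project, post_id) = parse_cohost_post_url(url)?;
    Some(format!("/{}/{}", project, post_id))
}
```

For the example URL given on the page, `https://cohost.org/noracodes/post/169186-october-update`, this yields the path `/noracodes/169186`.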