Automation · Open Source · Free · Active · Machine-verified · beginner · ~2 min setup

Dedupe and Rank a Keyword List with Coreutils

Turn a messy keyword dump into a clean, frequency-ranked list using only standard coreutils commands.

by Shilpa Mitra · verified 10d ago · v1.0.0

CI-verified, 2/2 fixtures passing.

Intended Use

Anyone who pastes a raw keyword/tag dump and wants a deduped, frequency-sorted list without reaching for a dedicated tool or a SaaS.

Not for

  • Datasets over a few million lines (use a real DB)
  • Fuzzy or semantic dedupe (matching is exact, apart from case folding)

The Stack

Tested Against

bash@5.x · coreutils@9.x

Side effects & data flow

Network: none, local only
Writes: ./ranked.txt
Credentials: none required

Steps

  1. Normalize, count, and rank

    Lowercase every line, sort so that duplicates sit next to each other (uniq -c only merges adjacent lines), collapse them with a count, and sort by frequency descending.

    printf 'RAG\nrag\nAgents\nrag\nAGENTS\nlocal-llm\n' | tr 'A-Z' 'a-z' | sort | uniq -c | sort -rn
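
The same pipeline can read from a file and write the ./ranked.txt listed under Writes. This is a minimal sketch, assuming the raw dump sits in a hypothetical keywords.txt with one keyword per line; the input filename and the sample data are illustrative and not part of the original workflow.

```shell
# Hypothetical input file; the name keywords.txt is an assumption.
printf 'RAG\nrag\nAgents\nrag\nAGENTS\nlocal-llm\n' > keywords.txt

# Lowercase, sort so duplicates are adjacent, count, then rank by count.
tr 'A-Z' 'a-z' < keywords.txt | sort | uniq -c | sort -rn > ranked.txt

# Show the ranked list: one "count keyword" pair per line, highest count first.
cat ranked.txt
```

With this sample input, rag (3 occurrences) ranks first, followed by agents (2) and local-llm (1). The redirect overwrites any existing ranked.txt.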

Eval, 2 fixtures

Last passed: verified 10d ago
  • rag-ranked-top · contains · timeout 10s · max $0

    Expected: 3 rag

  • clean-exit · exit_code · timeout 10s · max $0

    Expected: 0
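
The two fixtures can be approximated by hand. This is a rough sketch, assuming that `contains` means a substring match on stdout and that `exit_code` checks the status of the last command in the pipeline. The actual harness is not shown here, so both semantics are assumptions, and the 10s timeout is not modeled.

```shell
# Capture the pipeline's stdout and the exit status of its last command.
out=$(printf 'RAG\nrag\nAgents\nrag\nAGENTS\nlocal-llm\n' | tr 'A-Z' 'a-z' | sort | uniq -c | sort -rn)
status=$?

# rag-ranked-top (assumed semantics): stdout contains the substring "3 rag".
case "$out" in
  *"3 rag"*) echo "rag-ranked-top: pass" ;;
  *)         echo "rag-ranked-top: FAIL" ;;
esac

# clean-exit (assumed semantics): the pipeline exited with status 0.
if [ "$status" -eq 0 ]; then
  echo "clean-exit: pass"
else
  echo "clean-exit: FAIL"
fi
```

A substring check confirms that rag was counted three times, but it does not confirm that rag is the first line of the output.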

Results

Replaces a paid keyword-cleaning SaaS for a one-off task; runs in <1s.