{
  "version": "https://jsonfeed.org/version/1",
  "title": "Ian's Digital Garden",
  "home_page_url": "https://ianwwagner.com/",
  "feed_url": "https://ianwwagner.com//tag-shell.json",
  "description": "",
  "items": [
    {
      "id": "https://ianwwagner.com//returning-to-emacs.html",
      "url": "https://ianwwagner.com//returning-to-emacs.html",
      "title": "Returning to Emacs",
      "content_html": "<h1><a href=\"#jetbrains-woes\" aria-hidden=\"true\" class=\"anchor\" id=\"jetbrains-woes\"></a>JetBrains woes</h1>\n<p>I have been a fan of JetBrains products for over a decade by now,\nand an unapologetic lover of IDEs generally.\nI've used PyCharm since shortly after it launched,\nand over the years I've used IntelliJ IDEA,\nWebStorm, DataGrip, RustRover, and more.\nI literally have the all products pack (and have for many years).</p>\n<p>I truly believe that a good IDE can be a productivity multiplier.\nYou get refactoring, jump-to-definition, symbol-aware search,\nsaved build/run configurations, a nice and consistent interface\nto otherwise terrible tooling (looking at you CMake and the half dozen Python package managers\nof the last decade and change).</p>\n<p>But something has changed over the past few years.\nThe quality of the product has generally deteriorated in several ways.\nWith the advent of LSP, the massive lead JetBrains had in &quot;code intelligence&quot;\nhas eroded, and in many cases no longer exists.\nThe resource requirements of the IDE have also ballooned massively,\neven occasionally causing memory pressure on my amply equipped MacBook Pro with 32GB of RAM.</p>\n<p>(Side note: I regularly have 3 JetBrains IDEs open at once because I need to work in many languages,\nand for some reason they refuse to ship a single product that does that.\nI would have paid for such a product.)</p>\n<p>And as if that weren't enough, it seems like I have to restart to install some urgent nagging update\nseveral times/week, usually related to one of their confusing mess of AI plugins\n(is AI Chat what we're supposed to use? Or Junie? Or... 
what?).\nTo top it all off, stability has gone out the window.\nAt least once/week, I will open my laptop from sleep,\nonly to find out that one or more of my JetBrains IDEs has crashed.\nUsually RustRover.\nWhich also eats up like 30GB of extra disk space for things like macro expansions\nand other code analysis.\nThe taxes are high and increasing on every front.</p>\n<h1><a href=\"#my-philosophy-of-editors\" aria-hidden=\"true\" class=\"anchor\" id=\"my-philosophy-of-editors\"></a>My philosophy of editors</h1>\n<p>So, I decided the time was right to give Emacs another shot.</p>\n<p>If you know me personally, you may recall that I made some strong statements in the past\nto the effect that spending weeks writing thousands of lines of Lua to get the ultimate Neovim config was silly.\nAnd my strongly worded statements of the past were partially based on my own experiences with such editors,\nincluding Emacs.\nBasically, I appreciate that you <em>can</em> &quot;build your own lightsaber&quot;,\nbut I did not consider that to be a good use of my time.\nOne of the reasons I like(d) JetBrains is that I <em>didn't</em> ever need to think about tweaking configs!</p>\n<p>But things have gotten so bad that I figured I'd give it a shot with a few stipulations.</p>\n<ol>\n<li>I would try it for a week, but if it seriously hampered my productivity after a few days, I'd switch back.</li>\n<li>I was only going to spend a few hours configuring it.</li>\n</ol>\n<p>With these constraints, I set off to see if I needed to revise my philosophy of editors.</p>\n<h1><a href=\"#why-emacs\" aria-hidden=\"true\" class=\"anchor\" id=\"why-emacs\"></a>Why Emacs?</h1>\n<p>Aside: why not (Helix|Neovim|Zed|something else)?\nA few reasons, in no particular order:</p>\n<ul>\n<li>I sorta know Emacs. I used it as one of my primary editors for a year or two in the early 2010s.</li>\n<li>I tried Helix for a week last year. 
It didn't stick; something about &quot;modal editing&quot; just does not fit with my brain.</li>\n<li>I don't mind a terminal per se, but we invented windowing systems decades before I was born and I don't understand the fascination\nwith running <em>everything</em> in a terminal (or a web browser, for that matter :P).</li>\n<li>If I'm going to go through the pain of switching, I want to be confident it'll be around and thriving in another 10 years.\nAnd it should work everywhere, including lesser known platforms like FreeBSD.</li>\n<li>If your movement keys require a QWERTY layout, I will be very annoyed.</li>\n</ul>\n<h1><a href=\"#first-impressions-3-days-in\" aria-hidden=\"true\" class=\"anchor\" id=\"first-impressions-3-days-in\"></a>First impressions (3 days in)</h1>\n<p>So, how's it going so far?\nHere are a few of the highlights.</p>\n<h2><a href=\"#lsps-have-improved-a-lot\" aria-hidden=\"true\" class=\"anchor\" id=\"lsps-have-improved-a-lot\"></a>LSPs have improved a lot!</h2>\n<p>It used to be the case that JetBrains had a dominant position in code analysis.\nThis isn't the case anymore, and most of the languages I use that would benefit from an LSP\nhave a great one available.\nThings have improved a lot, particularly in terms of Emacs integrations,\nover the past decade!\n<a href=\"https://www.gnu.org/software/emacs/manual/html_node/eglot/Eglot-Features.html\"><code>eglot</code></a> is now bundled with Emacs,\nso you don't even need to go out of your way to get some funky packages hooked up\n(like I had to with some flycheck plugin for Haskell back in the day).</p>\n<h3><a href=\"#refactoring-tools-have-also-improved\" aria-hidden=\"true\" class=\"anchor\" id=\"refactoring-tools-have-also-improved\"></a>Refactoring tools have also improved</h3>\n<p>The LSP-guided tools for refactoring have also improved a lot.\nIt used to be that only a &quot;real IDE&quot; had much better than grep and replace.\nI was happy to find that <code>eglot-rename</code> 
&quot;just worked&quot;.</p>\n<h3><a href=\"#docs\" aria-hidden=\"true\" class=\"anchor\" id=\"docs\"></a>Docs</h3>\n<p>I'm used to hovering my mouse over any bit of code, waiting a few seconds,\nand being greeted by a docs popover.\nThis is now possible in Emacs too with <code>eldoc</code> + your LSP.\nI added the <a href=\"https://github.com/casouri/eldoc-box\"><code>eldoc-box</code></a> plugin and configured it to my liking.</p>\n<h3><a href=\"#quick-fix-actions-work-too\" aria-hidden=\"true\" class=\"anchor\" id=\"quick-fix-actions-work-too\"></a>Quick fix actions work too!</h3>\n<p>So far, every single quick-fix action that I'm used to in RustRover\nseems to be there in the eglot integration with rust-analyzer.\nIt took me a few minutes to realize that this was called <code>eglot-code-actions</code>,\nbut once I figured that out, I was rolling.</p>\n<h2><a href=\"#jump-to-definition-works-great-but-navigation-has-caveats\" aria-hidden=\"true\" class=\"anchor\" id=\"jump-to-definition-works-great-but-navigation-has-caveats\"></a>Jump to definition works great, but navigation has caveats</h2>\n<p>I frequently use the jump-to-definition feature in IDEs,\nusually by command+clicking.\nYou can do the same in Emacs with <code>M-.</code>, which is a bit weird, but okay.\nI picked up the muscle memory after less than an hour.\nThe weird thing, though, is what happens next.\nI'm used to JetBrains and most other well-designed software (<em>glares in the general direction of Apple</em>)\n&quot;just working&quot; with the forward+back buttons that many input devices have.\nEmacs did not out of the box.</p>\n<p>One thing JetBrains did fairly well was bookmarking where you were in a file, and even letting you jump back after\nnavigating to the definition or to another file.\nThis had some annoying side effects with multiple tabs, which I won't get into, but it worked overall.\nIn Emacs, you can return from a definition jump with <code>M-,</code>, but there is no general 
navigate forward/backward concept.\nThis is where the build-your-own-lightsaber philosophy comes in, I guess.\nI knew I'd hit it eventually.</p>\n<p>I tried out a package called <code>better-jumper</code>, but it didn't <em>immediately</em> do what I wanted,\nso I abandoned it.\nI opted instead for simple backward and forward navigation.\nIt works alright.</p>\n<pre><code class=\"language-lisp\">(global-set-key (kbd &quot;&lt;mouse-3&gt;&quot;) #'previous-buffer)\n(global-set-key (kbd &quot;&lt;mouse-4&gt;&quot;) #'next-buffer)\n</code></pre>\n<p>Aside: I had to use <code>C-h k</code> (<code>describe-key</code>) to figure out what the mouse buttons were.\nAdvice I saw online apparently isn't universally applicable,\nand Xorg, macOS, etc. may number the buttons differently!</p>\n<h2><a href=\"#terminal-emulation-within-emacs\" aria-hidden=\"true\" class=\"anchor\" id=\"terminal-emulation-within-emacs\"></a>Terminal emulation within Emacs</h2>\n<p>The Emacs <code>shell</code> mode is terrible.\nIt's particularly unusable if you're running any sort of TUI application.\nA friend recommended <a href=\"https://codeberg.org/akib/emacs-eat\"><code>eat</code></a> as an alternative.\nThis worked pretty well out of the box with most things,\nbut when I ran <code>cargo nextest</code> for the first time,\nI was shocked at how slow it was.\nMy test suite, which normally runs in under a second, took over 30!\nYikes.\nI believe the slowness is because it's implemented in elisp,\nwhich is still pretty slow even when native compilation is enabled.</p>\n<p>Another Emacs user recommended I try out <a href=\"https://github.com/akermu/emacs-libvterm\"><code>vterm</code></a>, so I did.\nHallelujah!\nIt's no iTerm 2, and it does have a few quirks,\nbut it's quite usable and MUCH faster.\nIt also works better with full-screen TUI apps like Claude Code.</p>\n<h2><a href=\"#claude-code-cli-is-actually-great\" aria-hidden=\"true\" class=\"anchor\" 
id=\"claude-code-cli-is-actually-great\"></a>Claude Code CLI is actually great</h2>\n<p>I'm not going to get into the pros and cons of LLMs in this post.\nBut if you use these tools in your work,\nI think you'll be surprised by how good the experience is with <code>vterm</code> and the <code>claude</code> CLI.\nI have been evaluating JetBrains' disjoint attempts at integrations with Junie,\nand more recently Claude Code and Codex.</p>\n<p>Junie is alright for some things.\nThe only really good thing I have to say about the product is that at least it let me select a GPT model.\nAnthropic models have been severely hampered in their ability to do anything useful in most codebases I work in,\ndue to tiny context windows.\nThat recently changed when Anthropic rolled out a 1 million token context window to certain users.</p>\n<p>JetBrains confusingly refers to Claude Code as &quot;Claude Agent&quot;, and team subscriptions automatically include some monthly credits.\nEvery single JetBrains IDE will install its own separate copy of Claude Code (yay).\nBut it <em>is</em> really just shelling out to Claude Code, it seems\n(it asks for your permission to download the binary.\nCodex is the same.)</p>\n<p>Given this, I assumed the experience and overall quality would be similar.\nWell, I was VERY wrong there.\nClaude Code in the terminal is far superior for a number of reasons.\nNot just access to the new model, though that helps.\nYou can also configure &quot;effort&quot; (lol), and the &quot;plan&quot; mode seems to be far more sophisticated than what you get in the JetBrains IDEs.</p>\n<p>So yeah, if you're going to use these tools, just use the official app.\nIt makes sense; they have an incentive to push people to buy direct.\nAnd it so happens that Claude Code fits comfortably in my Emacs environment.</p>\n<p>More directly relevant to this post,\nLLMs (any of them really) are excellent at recommending Emacs packages and config tweaks.\nSo it's never been easier to give it 
a try.\nI've spent something like 2-3x longer writing this post than I did configuring Emacs.\n(And yes, before you ask, this post is 100% hand-written.)\nMy basic flow was to work, get annoyed (that's pretty easy for me),\nand describe my problem to ChatGPT or Claude.\nI am nowhere near the hours I budgeted for config fiddling.\nThat surprised me!</p>\n<h2><a href=\"#vcs-integration\" aria-hidden=\"true\" class=\"anchor\" id=\"vcs-integration\"></a>VCS integration</h2>\n<p>While I'm no stranger to hacking around with nothing more than a console,\nI really don't like the git CLI.\nI've heard jj is better, but honestly I think GUIs are pretty great most of the time.\nI will probably try magit at some point,\nbut for now I'm very happy with Sublime Merge.</p>\n<p>But one thing I MUST have in my editor is a &quot;gutter&quot; view of lines that are new/changed,\nand a way to get a quick inline diff.\nJetBrains had a great UX for this, which I used daily.\nAnd for Emacs, I found something just as great: <a href=\"https://github.com/dgutov/diff-hl\"><code>diff-hl</code></a>.</p>\n<p>My config for this is very simple:</p>\n<pre><code class=\"language-lisp\">(unless (package-installed-p 'diff-hl)\n  (package-install 'diff-hl))\n(use-package diff-hl\n  :config\n  (global-diff-hl-mode))\n</code></pre>\n<p>To get a quick diff of a section that's changed,\nI use <code>diff-hl-show-hunk</code>.\nI might even like the hunk review experience here better than in JetBrains!</p>\n<h2><a href=\"#project-wide-search\" aria-hidden=\"true\" class=\"anchor\" id=\"project-wide-search\"></a>Project-wide search</h2>\n<p>I think JetBrains has the best search around with their double-shift, cmd+shift+o, and cmd+shift+f views.\nI have not yet gotten my Emacs configured to be as good.\nBut <code>C-x p g</code> (<code>project-find-regexp</code>) is pretty close.\nI'll look into other plugins later for fuzzy filename/symbol search.\nI <em>do</em> miss that.</p>\n<h2><a 
href=\"#run-configurations\" aria-hidden=\"true\" class=\"anchor\" id=\"run-configurations\"></a>Run configurations</h2>\n<p>The final pleasant surprise is that I don't miss JetBrains run configurations as much as I expected.\nI've instead switched to putting a <a href=\"https://just.systems/man/en/introduction.html\"><code>justfile</code></a> in my repo and populating that with my run configurations\n(much of the software I work on has half a dozen switches which vary by environment).\nThis also has the side effect of cleaning up some of my CI configuration (<code>just</code> run the same thing!)\nand serves as useful documentation to LLMs.</p>\n<h2><a href=\"#spell-checking\" aria-hidden=\"true\" class=\"anchor\" id=\"spell-checking\"></a>Spell checking</h2>\n<p>I have <a href=\"https://github.com/crate-ci/typos\"><code>typos</code></a> configured for most of my projects in CI,\nbut it drives me nuts when an editor doesn't flag typos for me.\nJetBrains did this well.\nEmacs has nothing out of the box (Zed also annoyingly doesn't ship with anything, which is really confusing to me).\nBut it's easy to add.</p>\n<p>I went with Jinx.\nThere are other options, but this one seemed pretty modern and worked without any fuss, so I stuck with it.</p>\n<h1><a href=\"#papercuts-to-solve-later\" aria-hidden=\"true\" class=\"anchor\" id=\"papercuts-to-solve-later\"></a>Papercuts to solve later</h1>\n<p>This is all a lot more positive than I was expecting, to be honest!\nI am not going to cancel my JetBrains subscription tomorrow;\nthey still <em>do</em> make the best database tool I know of.\nBut I've moved all my daily editing to Emacs.</p>\n<p>That said, there are still some papercuts I need to address:</p>\n<ul>\n<li>Macro expansion. I liked that in RustRover. There's apparently a way to get this with <code>eglot-x</code> which I'll look into later.</li>\n<li>Automatic indentation doesn't work out of the box for all modes to my liking. 
I think I've fixed most of these, but found the process confusing.</li>\n<li>Files don't reload in buffers automatically when they change on disk (e.g. after <code>cargo fmt</code>)!</li>\n<li>Code completion and jump to definition don't work inside rustdoc comments.</li>\n<li>RustRover used to highlight all of my <code>mut</code> variables. I would love to get that back in Emacs.</li>\n</ul>\n",
      "summary": "",
      "date_published": "2026-03-18T00:00:00-00:00",
      "image": "",
      "authors": [
        {
          "name": "Ian Wagner",
          "url": "https://fosstodon.org/@ianthetechie",
          "avatar": "media/avi.jpeg"
        }
      ],
      "tags": [
        "software-engineering",
        "shell"
      ],
      "language": "en"
    },
    {
      "id": "https://ianwwagner.com//using-tar-with-your-favorite-compression.html",
      "url": "https://ianwwagner.com//using-tar-with-your-favorite-compression.html",
      "title": "Using tar with Your Favorite Compression",
      "content_html": "<p>Here's a fun one!\nYou may already know that tar is a pure archive format,\nand that any compression is applied to the whole archive as a unit.\nThat is to say that compression is not actually applied at the <em>file</em> level,\nbut to the entire archive.</p>\n<p>This is a trade-off the designers made to limit complexity,\nand as a side-effect, is the reason why you can't randomly access parts of a compressed tarball.</p>\n<p>What you may not know is that the <code>tar</code> utility has built-in support for a few formats!\nGZIP is probably the most commonly used for historical reasons,\nbut <code>zstd</code> and <code>lz4</code> are built-in options on my Mac.\nThis is probably system-dependent, so check your local manpages.</p>\n<p>Here's an example of compressing and decompressing with <code>zstd</code>:</p>\n<pre><code class=\"language-shell\">tar --zstd -cf directory.tar.zst directory/\ntar --zstd -xf directory.tar.zst\n</code></pre>\n<p>You can also use this with <em>any</em> (de)compression program that operates on stdin and stdout!</p>\n<pre><code class=\"language-shell\">tar --use-compress-program zstd -cf directory.tar.zst directory/\n</code></pre>\n<p>Pretty cool, huh?\nIt's no different than using pipes at the end of the day,\nbut it does simplify the invocation a bit in my opinion.</p>\n<p>After I initially published this article,\n<code>@cartocalypse@norden.social</code> noted that some versions of tar include the\n<code>-a</code>/<code>--auto-compress</code> option which will automatically determine format and compression based on the suffix!\nCheck your manpages for details; it appears to work on FreeBSD, macOS (which inherits the FreeBSD implementation), and GNU tar.</p>\n",
      "summary": "",
      "date_published": "2025-12-14T00:00:00-00:00",
      "image": "",
      "authors": [
        {
          "name": "Ian Wagner",
          "url": "https://fosstodon.org/@ianthetechie",
          "avatar": "media/avi.jpeg"
        }
      ],
      "tags": [
        "shell",
        "compression",
        "tar"
      ],
      "language": "en"
    },
    {
      "id": "https://ianwwagner.com//delightfully-simple-pipelines-with-nushell.html",
      "url": "https://ianwwagner.com//delightfully-simple-pipelines-with-nushell.html",
      "title": "Delightfully Simple Pipelines with Nushell",
      "content_html": "<p>I've been using <a href=\"https://www.nushell.sh/\">nushell</a> as my daily driver for about six months now,\nand wanted to show a few simple examples of why I'm enjoying it so much.\nI think it's a breath of fresh air compared to most shells.</p>\n<h1><a href=\"#why-a-new-shell\" aria-hidden=\"true\" class=\"anchor\" id=\"why-a-new-shell\"></a>Why a new shell?</h1>\n<p>In case you've never heard of it before, nushell is a, well, new shell ;)\n<code>bash</code> has been the dominant shell for as long as I can remember,\nthough <code>zsh</code> has its fair share of devotees.\n<code>fish</code> is the only recent example I can think of as a &quot;challenger&quot; shell.\n<code>fish</code> gained enough traction that it's supported by tooling such as Python <code>virtualenv</code>\n(which only has integrations out of the box for a handful of shells).\nI think <code>fish</code> is popular because it had some slightly saner defaults out of the box,\nwas easier to &quot;customize&quot; with flashy prompts (which can make your shell SUPER slow to init),\nand had a saner scripting language than <code>bash</code>.\nBut it still retained a lot of the historical baggage from POSIX shells.</p>\n<p>Nushell challenges three common assumptions about shells\nand asks &quot;what if things were different?&quot;</p>\n<ol>\n<li>POSIX compliance is a non-goal.</li>\n<li>Many standard tools from GNU coreutils/base system (e.g. 
<code>ls</code> and <code>du</code>) are replaced by builtins.</li>\n<li>All nushell &quot;native&quot; utilities produce and consume <strong>structured data</strong> rather than text by default.</li>\n</ol>\n<p>By dropping the goal of POSIX compliance,\nnushell frees itself from decades of baggage.\nThis means you get a scripting language that feels a lot more like Rust.\nYou'll actually get errors by default when you try to do something stupid,\nunlike most shells, which will happily proceed,\nusually doing something even more stupid.\nMaybe treating undefined variables as an empty string made sense in the 1970s,\nbut that's almost never helpful.</p>\n<p>nushell also takes a relatively unique approach to utilities.\nWhen you type something like <code>ls</code> or <code>ps</code> in nushell,\nthis is handled by a shell builtin!\nIt's just Rust code baked into the shell rather than calling out to GNU coreutils\nor whatever your base system includes.\nThis means that whether you type <code>ps</code> on FreeBSD, Debian, or macOS,\nyou'll get the same behavior!</p>\n<p>I can already hear some readers thinking &quot;doesn't this just massively bloat the shell?&quot;\nNo, not really.\nThe code for these is far less than that of the typical GNU utility,\nbecause nushell actually (IMO) embraces UNIX philosophy even better than the original utilities.\nThey are all extremely minimal and work with other builtins.\nFor example, there are no sorting flags for <code>ls</code>,\nand no format/unit flags for <code>du</code>.</p>\n<p>The reason that nushell <em>can</em> take this approach is because they challenge the notion that\n&quot;text is the universal API.&quot;\nYou <em>can't</em> meaningfully manipulate text without lots of lossy heuristics.\nBut you <em>can</em> do this for structured data!\nI admit I'm a bit of a chaos monkey, so I love to see a project taking a rare new approach\nin a space where nothing has fundamentally changed since the 1970s.</p>\n<p>Okay, enough about 
philosophy... here are a few examples of some shell pipelines I found delightful.</p>\n<h1><a href=\"#elastic-snapshot-status\" aria-hidden=\"true\" class=\"anchor\" id=\"elastic-snapshot-status\"></a>Elastic snapshot status</h1>\n<p>First up: I do a lot of work with Elasticsearch during <code>$DAYJOB</code>.\nOne workflow I have to do fairly often is spin up a cluster and restore from a snapshot.\nThe Elasticsearch API is great... for programs.\nBut I have a hard time grokking hundreds of lines of JSON.\nHere's an example of a pipeline I built which culls the JSON response down to just the section I care about.</p>\n<pre><code class=\"language-nu\">http get &quot;http://localhost:9200/myindex/_recovery&quot;\n  | get restored-indexname\n  | get shards\n  | get index\n  | get size\n</code></pre>\n<p>In case it's not obvious, the <code>http</code> command makes HTTP requests.\nThis is another nushell builtin that is an excellent alternative to <code>curl</code>.\nIt's not as feature-rich (<code>curl</code> has a few decades of head start),\nbut it brings something new to the table: it understands from the response that the content is JSON,\nand converts it into structured data!\nEverything in this pipeline is nushell builtins.\nAnd it'll work on <em>any</em> OS that nushell supports.\nEven Windows!\nThat's wild!</p>\n<p>Pro tip: you can press option+enter to add a new line when typing in the shell.</p>\n<h1><a href=\"#disk-usage-in-bytes\" aria-hidden=\"true\" class=\"anchor\" id=\"disk-usage-in-bytes\"></a>Disk usage in bytes</h1>\n<p>Here's an example I hinted at earlier.\nIf you type <code>help du</code> (to get built-in docs),\nyou won't find any flags for changing the units.\nBut you can do it using formatters like so:</p>\n<pre><code class=\"language-nu\">du path/to/bigfile.bin | format filesize B apparent\n</code></pre>\n<p>The <code>du</code> command <em>always</em> shows human-readable units by default. 
Which I very much appreciate!\nAnd did you notice <code>apparent</code> at the end there?\nWell, the version of <code>du</code> you'd find with a typical Linux distro doesn't <em>exactly</em> lie to you,\nbut it withholds some very important information.\nThe physical size occupied on disk is not necessarily the same as how large the file\n(in an abstract platonic sense) <em>actually</em> is.</p>\n<p>There are a bunch of reasons for this, but the most impactful one is compressed filesystems.\nIf I ask Linux <code>du</code> how large a file is in an OpenZFS dataset,\nit will report the physical size by default, which may be a few hundred megabytes\nwhen the file is really multiple gigabytes.\nNot <em>necessarily</em> helpful.</p>\n<p>Anyways, the nushell builtin always gives you columns for both physical and apparent sizes.\nSo you can't ignore the fact that these sizes are often different.\nI like that!</p>\n<h1><a href=\"#some-other-helpful-bits-for-switching\" aria-hidden=\"true\" class=\"anchor\" id=\"some-other-helpful-bits-for-switching\"></a>Some other helpful bits for switching</h1>\n<p>If you want to give nushell a try,\nthey have some great documentation.\nRead the basics, but also check out their specific pages on, e.g.\n<a href=\"https://www.nushell.sh/book/coming_from_bash.html\">coming from bash</a>.</p>\n<p>Finally, here are two more things that tripped me up at first.</p>\n<ul>\n<li>If you want to get the PID of your process, use <code>$nu.pid</code> instead of <code>$$</code>.</li>\n<li>To access environment <em>variables</em>, you need to be explicit and go through <code>$env</code>. On the plus side, you can now explicitly differentiate environment variables from regular shell variables.</li>\n</ul>\n",
      "summary": "",
      "date_published": "2025-12-13T00:00:00-00:00",
      "image": "",
      "authors": [
        {
          "name": "Ian Wagner",
          "url": "https://fosstodon.org/@ianthetechie",
          "avatar": "media/avi.jpeg"
        }
      ],
      "tags": [
        "shell"
      ],
      "language": "en"
    },
    {
      "id": "https://ianwwagner.com//faster-ssh-file-transfers-with-rsync.html",
      "url": "https://ianwwagner.com//faster-ssh-file-transfers-with-rsync.html",
      "title": "Faster SSH File Transfers with rsync",
      "content_html": "<p>If you're a developer or sysadmin, there's a pretty good chance you've had to transfer files back and forth.\nBack in the old days, you may have used the <code>ftp</code> utility or something similar\n(I think my first one was probably CuteFTP).\nThen you probably thought better of doing things in plaintext,\nand switched to SFTP/SCP, which operate over an SSH connection.</p>\n<p>My quick post today is not exactly a bombshell of new information,\nbut these tools are not the fastest way to transfer.\nThere's often not a huge difference if you're transferring between machines in the same datacenter,\nbut you can do many times better when transferring from home or the office.</p>\n<p>Of course, I'm talking about <code>rsync</code>, which has a lot of features like compression, partial transfer resume,\nchecksumming so that just the deltas of a large file are sent, and more.\nI don't know why, but it's rarely in my consciousness, and always strikes me as a bit quirky.\nYou need quite a few flags for what I would consider to be reasonable defaults.\nBut if you remember those (or use shell history, like me),\nit can save you a ton of time.</p>\n<p>In fact, this morning, using rsync saved me more time than it took to write this blog post.\nI was transferring a ~40GB file from a server halfway around the world,\nbut I only had to transfer bytes equivalent to 20% of the total.</p>\n<p>Here's a look at the <code>rsync</code> command I often use for pulling files from a remote server\n(I usually do this to download a large build artifact without going through a cloud storage intermediary,\nwhich is both slower and more expensive):</p>\n<pre><code class=\"language-shell\">rsync -Pzhv example.com:/remote/path/to/big/file ~/file\n</code></pre>\n<p>It's not really that bad compared to an <code>scp</code> invocation, but those flags make all the difference.\nHere's what they do:</p>\n<ul>\n<li><code>-P</code> - Keeps partially transferred files (in case of 
interruption) and shows progress during the transfer.</li>\n<li><code>-z</code> - Compresses data on the source server before sending to the destination. This probably isn't a great idea for an intra-datacenter transfer (it just wastes CPU), but it's perfect for long-distance transfers over &quot;slower&quot; links (where I'd say slower is something less than like 100Mbps between you and the server... the last part is important because you may have a gigabit link, but the peering arrangements or other issues conspire to limit effective transfer rates to something much lower).</li>\n<li><code>-h</code> - Makes the output more &quot;human-readable.&quot; This shows prefixes like K, M, or G. You can add it twice if you'd like 1024-based values instead of 1000-based. To be honest, I don't know why this isn't the default.</li>\n<li><code>-v</code> - Verbose mode. By default, <code>rsync</code> is silent. This is another behavior that I find strange in the present, but probably made more sense in the era of teletype terminals and very slow links. It's not really that verbose; it just tells you what files are being transferred and a brief summary at the end. 
You actually have to give <em>two</em> v's (<code>-vv</code>) for <code>rsync</code> to tell you which files it's skipping!</li>\n</ul>\n<p>Hope this helps speed up your next remote file transfer.\nIf there are any options you like which I may have missed, hit me up on Mastodon with a suggestion!</p>\n<p>Bonus pro tip: I had (until I recently switched to nushell) over a decade of accumulated shell history,\nand sometimes it's hard to keep the options straight.\nNaturally my history has a few different variations on a command like <code>rsync</code>.\nRather than searching through manpages with the equivalent of <code>grep</code>,\nI usually go to <a href=\"https://explainshell.com\">explainshell.com</a>.\nIt seems to be a frontend to the manpages that understands the various sections,\nproviding a much quicker explanation of what your switches do!</p>\n",
      "summary": "",
      "date_published": "2025-12-08T00:00:00-00:00",
      "image": "",
      "authors": [
        {
          "name": "Ian Wagner",
          "url": "https://fosstodon.org/@ianthetechie",
          "avatar": "media/avi.jpeg"
        }
      ],
      "tags": [
        "shell"
      ],
      "language": "en"
    },
    {
      "id": "https://ianwwagner.com//searching-for-tiger-features.html",
      "url": "https://ianwwagner.com//searching-for-tiger-features.html",
      "title": "Searching for TIGER Features",
      "content_html": "<p>Today I had a rather peculiar need to search through features from TIGER\nmatching specific attributes.\nThese files are not CSV or JSON, but rather ESRI Shapefiles.\nShapefiles are a binary format which has long outlived its welcome\naccording to many in the industry, but they still persist today.</p>\n<h1><a href=\"#context\" aria-hidden=\"true\" class=\"anchor\" id=\"context\"></a>Context</h1>\n<p>Yeah, so this post probably isn't interesting to very many people,\nbut here's a bit of context in case you don't know what's going on and you're still reading.\nTIGER is a geospatial dataset published by the US government.\nThere's far more to this dataset than fits in this TIL post,\nbut my interest in it lies in finding addresses.\nSpecifically, <em>guessing</em> at where an address might be.</p>\n<p>When you type an address into your maps app,\nthey might not actually have the exact address in their database.\nThis happens more than you might imagine,\nbut you can usually get a pretty good guess of where the address is\nvia a process called interpolation.\nThe basic idea is that you take address data from multiple sources and use that to make a better guess.</p>\n<p>Some of the input to this is existing address points.\nBut there's one really interesting form of data that brings us to today's TIL:\naddress ranges.\nOne of the TIGER datasets is a set of lines (for the roads).\nEach segment is annotated with info letting us know the range of house numbers on each side of the road.</p>\n<p>I happen to use this data for my day job at Stadia Maps,\nwhere I was investigating a data issue today related to our geocoder and TIGER data.</p>\n<h1><a href=\"#getting-the-data\" aria-hidden=\"true\" class=\"anchor\" id=\"getting-the-data\"></a>Getting the data</h1>\n<p>In case you find yourself in a similar situation,\nyou may notice that the data from the government is sitting in an FTP directory,\nwhich contains a bunch of confusingly named ZIP 
files.\nThe data that I'm interested in (address features)\nhas names like <code>tl_2024_48485_addrfeat.zip</code>.</p>\n<p>The year might be familiar, but what's that other number?\nThat's a FIPS code for the county whose data is contained in the archive.\nYou can find a <a href=\"https://transition.fcc.gov/oet/info/maps/census/fips/fips.txt\">list here</a>.\nThis is somewhat interesting in itself, since the first 2 characters are a state code.\nTexas, in this case.\nThe full number makes up a county: Wichita County, in this case.\nYou can suck down the entire dataset, just one file, or anything in-between\nfrom the <a href=\"https://www.census.gov/geographies/mapping-files/time-series/geo/tiger-line-file.html\">Census website</a>.</p>\n<h1><a href=\"#searching-for-features\" aria-hidden=\"true\" class=\"anchor\" id=\"searching-for-features\"></a>Searching for features</h1>\n<p>So, now you have a directory full of ZIP files.\nEach of which has a bunch of files necessary to interpret the shapefile.\nIsn't GIS lovely?</p>\n<p>The following script will let you write a simple &quot;WHERE&quot; clause,\nfiltering the data exactly as it comes from the Census Bureau!</p>\n<pre><code class=\"language-bash\">#!/bin/bash\nset -e;\n\nfind &quot;$1&quot; -type f -iname &quot;*.zip&quot; -print0 |\\\n  while IFS= read -r -d $'\\0' filename; do\n\n    filtered_json=$(ogr2ogr -f GeoJSON -t_srs crs:84 -where &quot;$2&quot; /vsistdout/ /vsizip/$filename);\n    # Check if the filtered GeoJSON has any features\n    feature_count=$(echo &quot;$filtered_json&quot; | jq '.features | length')\n\n    if [ &quot;$feature_count&quot; -gt 0 ]; then\n      # echo filename to stderr\n      &gt;&amp;2 echo $(date -u) &quot;Match(es) found in $filename&quot;;\n      echo &quot;$filtered_json&quot;;\n    fi\n\n  done;\n</code></pre>\n<p>You can run it like so:</p>\n<pre><code class=\"language-shell\">./find-tiger-features.sh $HOME/Downloads/tiger-2021/ &quot;TFIDL = 213297979 OR TFIDR = 
213297979&quot;\n</code></pre>\n<p>This ends up being a LOT easier and faster than QGIS in my experience\nif you want to search for specific known attributes.\nEspecially if you don't know the specific area that you're looking for.\nI was surprised that so such tool for things like ID lookps existed already!</p>\n<p>Note that this isn't exactly &quot;fast&quot; by typical data processing workload standards.\nIt takes around 10 minutes to run on my laptop.\nBut it's a lot faster than the alternatives in many circumstances,\nespecilaly if you don't know exactly which file the data is in!</p>\n<p>For details on the fields available,\nrefer to the technical documentation on the <a href=\"https://www.census.gov/geographies/mapping-files/time-series/geo/tiger-line-file.html\">Census Bureau website</a>.</p>\n",
      "summary": "",
      "date_published": "2024-11-09T00:00:00-00:00",
      "image": "",
      "authors": [
        {
          "name": "Ian Wagner",
          "url": "https://fosstodon.org/@ianthetechie",
          "avatar": "media/avi.jpeg"
        }
      ],
      "tags": [
        "gis",
        "shell",
        "ogr2ogr"
      ],
      "language": "en"
    },
    {
      "id": "https://ianwwagner.com//copying-and-unarchiving-from-a-server-without-a-temp-file.html",
      "url": "https://ianwwagner.com//copying-and-unarchiving-from-a-server-without-a-temp-file.html",
      "title": "Copying and Unarchiving From a Server Without a Temp File",
      "content_html": "<p>Sometimes I want to copy files from a remote machine--usually a server I control.\nEasy; just use <code>scp</code>, right?</p>\n<p>Well, today I had a subtly different twist to the usual problem.\nI needed to transfer a ~100GB tarball to my local machine,\nand I really wanted to unarchive it so that I could get at the internal data directly.\nAnd I wanted to do it in one step, since I didn't have 200GB of free space.</p>\n<p>I happened to remember that this <em>should</em> be possible with pipes or something.\nThe tarball format is helpfully designed to allow streaming.\nBut it took me a bit to come up with the right set of commands to do this.</p>\n<p><code>scp</code> is really designed for dumping to files first.\nI found a few suggestions on StackOverflow that looked like they might work,\nbut didn't for me (might have been my shell? I use <code>fish</code> rather than <code>bash</code>).\nBut I noticed that almost all of the answers recommended using <code>ssh</code> instead,\nsince it's a bit more suited to the purpose.</p>\n<p>The basic idea is to dump the file to standard out on the remote host,\nthen pipe the ssh output into <code>tar</code> locally.\nThe <code>tar</code> flags are probably familiar or easily understandable: <code>-xvf -</code> in my case.\nThis puts <code>tar</code> into extract mode,\nenables verbose logging (so you see its progress),\nand tells it to read from stdin (<code>-</code>).\nMy <em>tarball</em> was not compressed.\nIf yours is, add the appropriate decompression flags.</p>\n<p>The SSH flags were a bit trickier.\nI discovered the <code>-C</code> flag, which enables gzip compression.\nI happen to know this dataset compresses well with gzip,\nand further that the network link between me and the remote is not the best,\nso I enabled it.\nDon't use this if your data does not compress well,\nor if it is already compressed.</p>\n<p>Another flag, <code>-e none</code>,\nI found via <a 
href=\"https://www.unix.com/unix-for-dummies-questions-and-answers/253941-scp-uncompress-file.html\">this unix.com forum post</a>.\nThis seemed like a good thing to enable after some research,\nsince sequences like <code>~.</code> will not be interpreted as &quot;kill the session.&quot;\nIt also prevents more subtle bugs which would look like data corruption.</p>\n<p><code>-T</code> was suggested after I pressed ChatGPT o1-preview for other flags that might be helpful.\nIt just doesn't allocate a pseudo-terminal.\nWhich we didn't need anyways.\n(Aside: ChatGPT 4o will give you some hot garbage suggestions; o1-preview was only helpful in suggesting refinements.)</p>\n<p>Finally, the command executes <code>cat</code> on the remote host to dump the tarball to <code>stdout</code>.\nI saw suggestions as well to use <code>dd</code> since you can set the block size explicitly.\nThat might improve perf in some situations if you know your hardware well.\nOr it might just be a useless attempt at premature optimization ;)</p>\n<p>Here's the final command:</p>\n<pre><code class=\"language-shell\">ssh -C -e none -T host.example.com 'cat /path/to/archive.tar' | tar -xvf -\n</code></pre>\n",
      "summary": "",
      "date_published": "2024-10-28T00:00:00-00:00",
      "image": "",
      "authors": [
        {
          "name": "Ian Wagner",
          "url": "https://fosstodon.org/@ianthetechie",
          "avatar": "media/avi.jpeg"
        }
      ],
      "tags": [
        "ssh",
        "terminal",
        "shell",
        "tar"
      ],
      "language": "en"
    }
  ]
}