<!DOCTYPE html>
<html lang="en">
<head>
<title>How It Is Made: Master Research Thesis</title>
<meta charset="utf-8" />
<meta name="application-name" content="Publishing is Coding: Change My Mind">
<meta name="description" content="Blog about free culture, free software and free publishing.">
<meta name="keywords" content="publishing, blog, book, ebook, methodology, foss, libre-software, format, markdown, html, epub, pdf, mobi, latex, tex, culture, free culture, philosophy">
<meta name="viewport" content="width=device-width, user-scalable=0">
<link rel="shortcut icon" href="../../../icon.png">
<link rel="alternate" type="application/rss+xml" href="https://perrotuerto.blog/feed/" title="Publishing is Coding: Change My Mind">
<link type="text/css" rel="stylesheet" href="../../../css/styles.css">
<link type="text/css" rel="stylesheet" href="../../../css/extra.css">
<script type="application/javascript" src="../../../js/functions.js"></script>
</head>
<body>
<header>
<h1><a href="https://perrotuerto.blog/content/html/en/">Publishing is Coding: Change My Mind</a></h1>
<nav> <p> <a href="../../../content/html/en/_links.html">Links</a> | <a href="../../../content/html/en/_about.html">About</a> | <a href="../../../content/html/en/_contact.html">Contact</a> | <a href="../../../content/html/en/_fork.html">Fork</a> | <a href="../../../content/html/en/_donate.html">Donate</a> </p>
</nav>
</header>
<div id="controllers">
<a onclick="zoom(true)">+</a>
<a onclick="zoom(false)">-</a>
<a onclick="mode(this)">N</a>
</div>
<section>
<h1 id="how-it-is-made-master-research-thesis">How It Is Made: Master Research Thesis</h1>
<blockquote class="published">
<p>Published: 2020/02/15, 13:00 | <a href="http://zines.perrotuerto.blog/pdf/005_hiim-master_en.pdf"><span class="smallcap">PDF</span></a> | <a href="http://zines.perrotuerto.blog/pdf/005_hiim-master_en_imposition.pdf"><span class="smallcap">Booklet</span></a></p>
</blockquote>
<p>Uff, after six months of writing, reviewing, deleting, yelling and almost giving up, I finally finished the Master's research thesis. You can check it out <a href="https://maestria.perrotuerto.blog">here</a>.</p>
<p>The thesis is about intellectual property, commons and cultural and philosophical production. I completed the Master's of Philosophy at the National Autonomous University of Mexico (<span class="smallcap">UNAM</span>). This research was written in Spanish and it consists of almost 27K words and ~100 pages.</p>
<p>From the beginning, I decided not to write it with a word processor such as <a href="https://www.libreoffice.org">LibreOffice</a> or Microsoft Office. I made that decision because:</p>
<ul>
<li>
<p>Office software was designed for a particular kind of work, not for research purposes.</p>
</li>
<li>
<p>Managing the bibliography or reviewing the writing could get very messy.</p>
</li>
<li>
<p>I needed several outputs which would require heavy clean up if I wrote the research in <span class="smallcap">ODT</span> or <span class="smallcap">DOCX</span> formats.</p>
</li>
<li>
<p>I wanted to see how far I could go by just using <a href="https://en.wikipedia.org/wiki/Markdown">Markdown</a>, a terminal and <a href="https://en.wikipedia.org/wiki/Free_and_open-source_software"><span class="smallcap">FOSS</span></a>.</p>
</li>
</ul>
<p>In general the thesis is actually an automated repository where you can see everything—including the entire bibliography, the site and the writing history. The research uses a <a href="https://en.wikipedia.org/wiki/Rolling_release">rolling release</a> model—“the concept of frequently delivering updates.” The methodology is based on automated and multiformat standardized publishing, or as I like to call it: branched publishing.</p>
<p>This isn't the space to discuss the method, but these are some general ideas:</p>
<ul>
<li>
<p>We have some inputs which are our working files.</p>
</li>
<li>
<p>We need several outputs which would be our ready-to-ship files.</p>
</li>
<li>
<p>We want automation so we only focus on writing and editing, instead of losing our time in formatting or having nightmares with layout design.</p>
</li>
</ul>
<p>In order to be successful, it's necessary to avoid any kind of <a href="https://en.wikipedia.org/wiki/WYSIWYG"><span class="smallcap">WYSIWYG</span></a> or <a href="https://en.wikipedia.org/wiki/Desktop_publishing">Desktop Publishing</a> approach. Instead, branched publishing employs <a href="https://en.wikipedia.org/wiki/WYSIWYM"><span class="smallcap">WYSIWYM</span></a> and typesetting systems.</p>
<p>So let's start!</p>
<h2 id="inputs">Inputs</h2>
<p>I have two main input files: the content of the research and the bibliography. I used Markdown for the content. I decided to use <a href="https://www.overleaf.com/learn/latex/Articles/Getting_started_with_BibLaTeX">BibLaTeX</a> for the bibliography.</p>
<h3 id="markdown">Markdown</h3>
<p>Why Markdown? Because it is:</p>
<ul>
<li>
<p>easy to read, write and edit</p>
</li>
<li>
<p>easy to process</p>
</li>
<li>
<p>a lightweight format</p>
</li>
<li>
<p>a plain and open format</p>
</li>
</ul>
<p>Markdown format was intended for blog writing. So “vanilla” Markdown isn't enough for research or scholarly writing. And I'm not a fan of <a href="https://pandoc.org/MANUAL.html#pandocs-markdown">Pandoc's Markdown</a>.</p>
<p>Don't get me wrong, <a href="https://pandoc.org">Pandoc</a> <i>is</i> the Swiss Army knife of document conversion; its name suits it perfectly. But for the type of publishing I do, Pandoc is part of the automation process and not a source of input or output formats. I use Pandoc as a middleman for some formats as it helps me save a lot of time.</p>
<p>For inputs and output formats I think Pandoc is a great general purpose tool, but not enough for a fussy publisher like this <i>perro</i>. Plus, I love scripting so I prefer to employ my time on that instead of configuring Pandoc's outputs—it helps me learn more. So in this publishing process, Pandoc is used when I haven't resolved something or I'm too lazy to do it, <span class="smallcap">LOL</span>.</p>
<p>Unlike word processing formats such as <span class="smallcap">ODT</span> or <span class="smallcap">DOCX</span>, <span class="smallcap">MD</span> is very easy to customize. You don't need to install plugins; you just define more syntax!</p>
<p>So <a href="http://pecas.perrotuerto.blog/html/md.html">Pecas' Markdown</a> was the base format for the content. The additional syntax was for citing the bibliography by its id.</p>
<figure>
<img src="../../../img/p005_i001.png" alt="The research in its original MD input."/>
<figcaption>
The research in its original <span class="smallcap">MD</span> input.
</figcaption>
</figure>
<h3 id="biblatex">BibLaTeX</h3>
<p>Formatting a bibliography is one of the main headaches for many researchers. It requires a lot of time and energy to learn how to quote and cite. And no matter how much experience one may have, the references or the bibliography usually have typos.</p>
<p>I know it from experience. Most of our clients' bibliographies are a huge mess. But 99.99% of the time it's because they do it manually… So I decided to avoid that hell.</p>
<p>There are several alternatives for bibliography formatting and the most common one is BibLaTeX, the successor of <a href="https://en.wikipedia.org/wiki/BibTeX">BibTeX</a>. With this type of format you can structure your bibliography as a kind of object notation. Here is a sample of an entry:</p>
<pre class="">
<code class="code-line-1">@book{proudhon1862a,</code><code class="code-line-2"> author = {Proudhon, Pierre J.},</code><code class="code-line-3"> date = {1862},</code><code class="code-line-4"> file = {:recursos/proudhon1862a.pdf:PDF},</code><code class="code-line-5"> keywords = {prio2,read},</code><code class="code-line-6"> publisher = {Office de publicité},</code><code class="code-line-7"> title = {Les Majorats littéraires},</code><code class="code-line-8"> url = {http://alturl.com/fiubs},</code><code class="code-line-9">}</code>
</pre>
<p>At the beginning of the entry you indicate its type and id. Each entry has a set of key-value pairs. Depending on the type of reference, there are some mandatory keys. If you need more, you can just add them in. This can be very difficult to edit by hand because <span class="smallcap">PDF</span> compilation doesn't tolerate syntax errors. For comfort, you can use a <span class="smallcap">GUI</span> like <a href="https://www.jabref.org">JabRef</a>. With this software you can easily generate, edit or delete bibliographic entries as if they were rows in a spreadsheet.</p>
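<p>To make the "type, id and key-value pairs" structure concrete, here is a minimal Python sketch that splits one entry into those parts. It is only an illustration: the <code>parse_entry</code> function is hypothetical, it is not part of JabRef, Biber or Pecas, and it assumes the simple one-field-per-line layout of the sample entry.</p>

```python
import re

# "@type{id," at the start of an entry.
ENTRY_HEAD = re.compile(r"@(\w+)\s*\{\s*([^,\s]+)\s*,")
# "key = {value}," with one field per line (an assumption of this sketch).
FIELD = re.compile(r"(\w+)\s*=\s*\{(.*?)\}\s*,?\s*$", re.MULTILINE)


def parse_entry(text):
    """Split one BibLaTeX entry into its type, id and key-value fields.

    Not handled: values spanning several lines, @string macros, comments.
    """
    head = ENTRY_HEAD.search(text)
    if head is None:
        raise ValueError("no '@type{id,' header found")
    body = text[head.end():]
    fields = {key.lower(): value for key, value in FIELD.findall(body)}
    return {"type": head.group(1).lower(), "id": head.group(2), "fields": fields}


SAMPLE = """@book{proudhon1862a,
  author    = {Proudhon, Pierre J.},
  date      = {1862},
  publisher = {Office de publicité},
  title     = {Les Majorats littéraires},
}"""

entry = parse_entry(SAMPLE)
print(entry["type"], entry["id"], entry["fields"]["date"])
```

<p>A forgotten brace or comma makes a field silently disappear from a parser like this one, which is a small taste of why hand editing is risky and a <span class="smallcap">GUI</span> is more comfortable.</p>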
<p>So I have two types of input formats: <span class="smallcap">BIB</span> for bibliography and <span class="smallcap">MD</span> for content. I make cross-references by generating some additional syntax that invokes bibliographic entries by their id. It sounds complicated, but for writing purposes it's just something like this:</p>
<blockquote>
<p>@textcite[someone2020a] states… Now I am paraphrasing someone so I would cite her at the end @parencite[someone2020a].</p>
</blockquote>
<p>When the bibliography is processed I get something like this:</p>
<blockquote>
<p>Someone (2020) states… Now I am paraphrasing someone so I would cite her at the end (Someone, 2020).</p>
</blockquote>
<p>This syntax is based on the LaTeX textual and parenthetical citation styles for <a href="http://tug.ctan.org/info/biblatex-cheatsheet/biblatex-cheatsheet.pdf">BibLaTeX</a>. The at sign (<code>@</code>) is the character I use at the beginning of any additional syntax for Pecas' Markdown. For processing purposes I could use any other kind of syntax. But for writing and editing tasks I found the at sign to be very accessible and easy to find.</p>
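<p>As a rough illustration of the substitution shown in the two quotes above, the following Python sketch expands <code>@textcite[id]</code> and <code>@parencite[id]</code> with a regular expression. It is not the actual processing code: the <code>expand_citations</code> function and the shape of the <code>entries</code> dictionary are assumptions, and the author-year rendering is hard-coded.</p>

```python
import re

# Matches @textcite[some-id] and @parencite[some-id].
CITE = re.compile(r"@(textcite|parencite)\[([^\]]+)\]")


def expand_citations(markdown, entries):
    """Replace citation calls with author-year text.

    `entries` maps an id to a dict with "author" (surname) and "year".
    Unknown ids are left untouched so they stay easy to spot.
    """
    def render(match):
        kind, key = match.groups()
        entry = entries.get(key)
        if entry is None:
            return match.group(0)
        if kind == "textcite":
            return f'{entry["author"]} ({entry["year"]})'
        return f'({entry["author"]}, {entry["year"]})'

    return CITE.sub(render, markdown)


entries = {"someone2020a": {"author": "Someone", "year": "2020"}}
source = "@textcite[someone2020a] states… I would cite her at the end @parencite[someone2020a]."
print(expand_citations(source, entries))
```

<p>Because the text only stores ids, changing a surname or a year in the bibliography changes every citation on the next run.</p>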
<p>The example was very simple and doesn't fully explore the point of doing this. By using ids:</p>
<ul>
<li>
<p>I don't have to worry if the bibliographic entries change.</p>
</li>
<li>
<p>I don't have to learn any citation style.</p>
</li>
<li>
<p>I don't have to write the bibliography section, it is done automatically!</p>
</li>
<li>
<p>I <i>always</i> get the correct structure.</p>
</li>
</ul>
<p>In a later section I explain how this process is possible. The main idea is that with some scripts these two inputs become one: a Markdown file with an added bibliography, ready for automation processes.</p>
<h2 id="outputs">Outputs</h2>
<p>I hate <span class="smallcap">PDF</span> as the only research output. Most of the time I do a general reading on screen and, if I want a more detailed reading, with notes and shit, I prefer to print it. It isn't comfortable to read a <span class="smallcap">PDF</span> on screen and most of the time printed <span class="smallcap">HTML</span> or ebooks are aesthetically unpleasant. That's why I decided to deliver different formats, so readers can pick what they like best.</p>
<p>Seeing how publishing is becoming more and more centralized, deploying the <span class="smallcap">MOBI</span> format for Kindle readers is unfortunately advisable. By the way, <span class="smallcap">FUCK</span> Amazon, they steal from writers and publishers; use Amazon only if the text isn't available from another source. I don't like proprietary software such as Kindlegen, but it is the only <i>legal</i> way to deploy <span class="smallcap">MOBI</span> files. I hope that little by little Kindle readers at least start to hack their devices. Right now Amazon is the shit people use, but remember: if you don't have it, you don't own it. Look what happened with <a href="https://www.npr.org/2019/07/07/739316746/microsoft-closes-the-book-on-its-e-library-erasing-all-user-content">books in Microsoft Store</a>.</p>
<p>What took the cake was a request from my tutor. He wanted an editable file he could use easily. Long ago Microsoft monopolized ewriting, so the easiest solution is to provide a <span class="smallcap">DOCX</span> file. I would prefer to use <span class="smallcap">ODT</span> format but I have seen how some people don't know how to open it. My tutor isn't part of that group, but for the outputs it's good to think not only about what we need but about what we could need. People barely read research; if it isn't accessible in a format they already know, they won't read it.</p>
<p>So, the outputs are the following:</p>
<ul>
<li>
<p><span class="smallcap">EPUB</span> as standard ebook format.</p>
</li>
<li>
<p><span class="smallcap">MOBI</span> for Kindle readers.</p>
</li>
<li>
<p><span class="smallcap">PDF</span> for printing.</p>
</li>
<li>
<p><span class="smallcap">HTML</span> for web surfers.</p>
</li>
<li>
<p><span class="smallcap">DOCX</span> as editable file.</p>
</li>
</ul>
<h3 id="ebooks">Ebooks</h3>
<figure>
<img src="../../../img/p005_i002.png" alt="The research in its EPUB output."/>
<figcaption>
The research in its <span class="smallcap">EPUB</span> output.
</figcaption>
</figure>
<p>I don't use Pandoc for ebooks, instead I use a publishing tool we are developing: <a href="https://pecas.perrotuerto.blog">Pecas</a>. “Pecas” means “freckles,” but in this context it's in honor of a pinto dog from my childhood.</p>
<p>Pecas allows me to deploy <span class="smallcap">EPUB</span> and <span class="smallcap">MOBI</span> formats from <span class="smallcap">MD</span> plus document statistics, file validations and easy metadata handling. Each Pecas project can be heavily customized since it allows Ruby, Python or shell scripts. The main objective behind this is the ability to remake ebooks from recipes. Therefore, the outputs are disposable in order to save space and because you don't need them all the time and shouldn't edit final formats!</p>
<p>Pecas is rolling release software under the <span class="smallcap">GNU</span> General Public License, so it's an open, free and <i>libre</i> program. For a couple of months Pecas has been unmaintained because this year we are going to start all over again, with cleaner code, easier installation and a bunch of new features (I hope; we need <a href="https://perrotuerto.blog/content/html/en/_donate.html">your support</a>).</p>
<h3 id="pdf">PDF</h3>
<p>For <span class="smallcap">PDF</span> output I rely on LaTeX and LuaLaTeX. Why? Just because it is what I'm used to. I don't have any particular argument against other frameworks or engines inside the TeX family. It's a world I still have to dig more into.</p>
<p>Why don't I use desktop publishing instead, like InDesign or Scribus? Outside of its own workflow, desktop publishing is hard to automate and maintain. This approach is great if you just want a <span class="smallcap">PDF</span> output or if you desire to work with a <span class="smallcap">GUI</span>. For file longevity and automated and multiformat standardized publishing, desktop publishing simply isn't the best option.</p>
<p>Why don't I just export a <span class="smallcap">PDF</span> from the <span class="smallcap">DOCX</span> file? I work in publishing, I still have some respect for my eyes…</p>
<p>Anyway, for this output I use Pandoc as a middleman. I could have managed the conversion from <span class="smallcap">MD</span> to <span class="smallcap">TEX</span> format with scripts, but I was lazy. So, Pandoc converts <span class="smallcap">MD</span> to <span class="smallcap">TEX</span> and LuaLaTeX compiles it into a <span class="smallcap">PDF</span>. I don't call either program directly; instead I wrote a script to automate this process. I explain this in a later section.</p>
<figure>
<img src="../../../img/p005_i003.png" alt="The research in its PDF output; I don't like justified text, it's bad for your eyes."/>
<figcaption>
The research in its <span class="smallcap">PDF</span> output; I don't like justified text, it's bad for your eyes.
</figcaption>
</figure>
<h3 id="html">HTML</h3>
<p>The <span class="smallcap">EPUB</span> format is actually a bunch of compressed <span class="smallcap">HTML</span> files plus metadata and a table of contents. So there is no reason to avoid an <span class="smallcap">HTML</span> output. I already have it by converting the <span class="smallcap">MD</span> with Pecas. I don't think anyone is gonna read 27K words in a web browser, but you never know. It could work for a quick look.</p>
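<p>The "compressed <span class="smallcap">HTML</span> files" claim is easy to check with any <span class="smallcap">ZIP</span> library. The Python sketch below builds a tiny EPUB-shaped archive in memory and lists the <span class="smallcap">HTML</span> files inside it. It is not a valid <span class="smallcap">EPUB</span> (the container file, package metadata and table of contents are left out), and the <code>OEBPS</code> folder and file names are just conventional placeholders chosen for this example.</p>

```python
import io
import zipfile


def build_epub_like_archive(chapters):
    """Pack {file name: XHTML string} into an in-memory, EPUB-shaped ZIP."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        # EPUB expects an uncompressed "mimetype" entry as the first file.
        archive.writestr("mimetype", "application/epub+zip",
                         compress_type=zipfile.ZIP_STORED)
        for name, xhtml in chapters.items():
            archive.writestr(f"OEBPS/{name}", xhtml)
    return buffer.getvalue()


def list_html_files(epub_bytes):
    """Return the names of the (X)HTML files stored in the archive."""
    with zipfile.ZipFile(io.BytesIO(epub_bytes)) as archive:
        return [name for name in archive.namelist()
                if name.endswith((".xhtml", ".html"))]


chapters = {
    "001.xhtml": "<html><body><h1>One</h1></body></html>",
    "002.xhtml": "<html><body><h1>Two</h1></body></html>",
}
print(list_html_files(build_epub_like_archive(chapters)))
```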
<h3 id="docx">DOCX</h3>
<p>This output doesn't have anything special. I didn't customize its styles. I just use Pandoc via another script. Remember, this file is for editing so its layout doesn't really matter.</p>
<h2 id="writing">Writing</h2>
<p>Besides the publishing method used in this research, I want to comment on some particularities about the influence of the technical setup over the writing.</p>
<h3 id="text-editors">Text Editors</h3>
<p>I never use word processors, so writing this thesis wasn't an exception. Instead, I prefer to use text editors. Among them I have a particular taste for the most minimalist ones like <a href="https://en.wikipedia.org/wiki/Vim_(text_editor)">Vim</a> or <a href="https://en.wikipedia.org/wiki/Gedit">Gedit</a>.</p>
<p>Vim is a terminal text editor. I use it on a regular basis—sorry <a href="https://en.wikipedia.org/wiki/Emacs">Emacs</a> folks. I write almost everything, including this thesis, with Vim because of its minimalist interface. No fucking buttons, no distractions, just me and the black-screen terminal.</p>
<p>Gedit is a <span class="smallcap">GUI</span> text editor and I use it mainly for <a href="https://en.wikipedia.org/wiki/Regular_expression">RegEx</a> or searches. In this project I utilized it for quick references to the bibliography. I like JabRef as a bibliography manager, but for getting the ids I just need access to the raw <span class="smallcap">BIB</span> file. Gedit was a good companion for that particular job because of its lack of “buttonware”, the annoying tendency to put buttons everywhere.</p>
<h3 id="citations">Citations</h3>
<p>I want the research to be as accessible as possible. I didn't want to use a complicated citation style. That's why I only used parenthetical and textual citations.</p>
<p>This could be an issue for many scholars. But when I see typos in their complex citations and quotations, I don't have any empathy. If you are gonna add complexity to your work, the least you can do is to do it right. And let's be honest, most scholars add complexity because they want to make themselves look good—i.e. they conform with formation rules for research texts in order to be part of a community or “gain” some objectivity.</p>
<h3 id="block-quotations">Block Quotations</h3>
<p>You are not going to see any block quotes in the research. This isn't only because of accessibility (some people can't distinguish these types of quotes) but also because of the way the bibliography was handled.</p>
<p>One of the main purposes of block quotations is to provide an extended, first-hand look at the point a writer is making. But sometimes they are also used as text filling. In the usual way of doing research in Philosophy, the output tends to be a “final” paper. That text is the research plus the bibliography. This format doesn't allow embedding any other files, like papers, websites, books or databases. If you want to provide some literal information, quotes and block quotes are the way to go.</p>
<p>Because this thesis is actually an automated repository, it contains all the references used for the research. It has a bibliography, but also each quoted work for backup and educational purposes. Why would I use block quotes if you could easily check the files? Even better, you could use some search function or go over all the data for validation purposes.</p>
<p>Moreover, the university doesn't allow long submissions. I agree with that; I think we have other technical capabilities that allow us to be more synthetic. By putting aside block quotes, I had more space for the actual research.</p>
<p>Take it or leave it, research as repository and not as a file gives us more possibilities for accessibility, portability and openness.</p>
<h3 id="footnotes">Footnotes</h3>
<p>Oh, the footnotes! Such a beautiful technique for displaying side text. It works great, it permits metawriting and so on. But it only works as expected if the output you are thinking of is, firstly, a file and secondly, a text with fixed layout. In other types of outputs, footnotes can be a nightmare.</p>
<p>I have the conviction that most footnotes can be incorporated into the text. This is due to three personal experiences. During my undergraduate and graduate studies, as Philosophy students we had to read a lot of fucking critical editions, which tend to have their “critical” notes as footnotes. For these types of text I get it: people don't want to confuse their words with someone else's, even less so when it's between a philosophical authority and a contemporary philosopher (take note that it's a personal taste and not a mandate). But this is a shitty Master's research thesis, not a critical edition.</p>
<p>I used to hate footnotes, now I just dislike them. Part of my job is to review, extract and fix other people's footnotes. I can bet you that half of the time footnotes aren't properly displayed or they are missing. Commonly this is not a software error. Sometimes it's because people do them manually. But I won't blame publishers or designers for their mistakes. The way things are developing in publishing, most of the time the issue is the lack of time. We are being pushed to publish books as fast as we can and one of the side effects of that is the loss of quality. Bibliography, footnotes and block quotes are the easiest way to find out how much care has gone into a text.</p>
<p>I do blame some authors for this mess. I repeat, it is just a personal experience, but in my work I have seen that most authors put footnotes in the following situations:</p>
<ul>
<li>
<p>They want to add shit but not to rewrite shit.</p>
</li>
<li>
<p>They aren't very good writers or they are in a hurry, so footnotes are the way to go.</p>
</li>
<li>
<p>They think that by adding footnotes, block quotes or references they can “earn” objectivity.</p>
</li>
</ul>
<p>I think the thesis needs more rewriting, I could have written things in a more comprehensive way, but I was done—writing philosophy is not my thing, I prefer to speak or program (!) it. That is why I took my time on the review process—ask my tutor about that, <span class="smallcap">LMFAO</span>. It would have been easier for me to just add footnotes, but it would have been harder for you to read that shit. Besides that, footnotes take more space than rewriting.</p>
<p>So, out of respect for the reader and in line with my university's length limits, I decided not to use footnotes.</p>
<h2 id="programming">Programming</h2>
<p>As you can see, I had to write some scripts and use third party software in order to have a thesis as an automated repository. It sounds difficult or perhaps like nonsense, but, doesn't Philosophy have that kind of reputation, anyway? >:)</p>
<h3 id="md-tools">MD Tools</h3>
<p>The first challenges I had were:</p>
<ul>
<li>
<p>I needed to know exactly how many pages I had written.</p>
</li>
<li>
<p>I wanted an easier way to beautify <span class="smallcap">MD</span> format.</p>
</li>
<li>
<p>I had to make some quality checks in my writing.</p>
</li>
</ul>
<p>Thus, I decided to develop some programs for these tasks: <a href="https://gitlab.com/snippets/1917485"><code>texte</code></a>, <a href="https://gitlab.com/snippets/1917487"><code>texti</code></a> and <a href="https://gitlab.com/snippets/1917488"><code>textu</code></a>, respectively.</p>
<p>These programs are actually Ruby scripts that I put in my <code>/usr/local/bin</code> directory. You can do the same, but I wouldn't recommend it. Right now in Programando <span class="smallcap">LIBRE</span>ros we are refactoring all that shit so they can be shipped as a Ruby gem. So I recommend waiting.</p>
<p>With <code>texte</code> I am able to know the number of lines, characters, characters without spaces and words, plus three different page counts: one page per 1,800 characters with spaces, one page per 250 words, and the average of both. You can set other lengths for page sizes.</p>
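<p>The page arithmetic is simple enough to sketch in a few lines. The following Python function is not <code>texte</code> (which is a Ruby script); it is a hypothetical <code>text_stats</code> that applies the same three rules, and it assumes that line breaks count as characters with spaces.</p>

```python
def text_stats(text, chars_per_page=1800, words_per_page=250):
    """Count lines, characters and words, and estimate pages three ways."""
    words = text.split()
    pages_by_chars = len(text) / chars_per_page
    pages_by_words = len(words) / words_per_page
    return {
        "lines": len(text.splitlines()),
        "chars": len(text),
        "chars_no_spaces": sum(1 for ch in text if not ch.isspace()),
        "words": len(words),
        "pages_by_chars": pages_by_chars,
        "pages_by_words": pages_by_words,
        # Third page count: the average of the two estimates above.
        "pages_average": (pages_by_chars + pages_by_words) / 2,
    }


sample = "abcd " * 360  # 1,800 characters and 360 words
stats = text_stats(sample)
print(stats["pages_by_chars"], stats["pages_by_words"], stats["pages_average"])
```

<p>By the 250-words rule alone, a text of 27,000 words comes to 108 pages, so the two rules can disagree noticeably depending on word length.</p>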
<p>The <span class="smallcap">MD</span> beautifier is <code>texti</code>. For the moment it only works well with paragraphs. It was good enough for me, my issue was with the disparate length of lines—yeah, I don't use line wrap.</p>
<figure>
<img src="../../../img/p005_i004.png" alt="texti sample help display."/>
<figcaption>
<code>texti</code> sample help display.
</figcaption>
</figure>
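<p>For readers curious about what evening out line lengths can look like, here is a small Python sketch using the standard <code>textwrap</code> module. It is a guess at the general idea, not a port of <code>texti</code>: the <code>reflow</code> function, the default width of 80 and the list of markers that protect non-paragraph blocks are all assumptions, and numbered lists are not protected.</p>

```python
import textwrap

# Blocks starting with these markers are left alone (headings, quotes,
# bullet lists, tables, fenced or indented code).
MARKERS = ("#", ">", "-", "*", "+", "|", "```", "    ")


def reflow(markdown, width=80):
    """Re-wrap plain paragraphs to `width`; keep every other block as is."""
    result = []
    for block in markdown.split("\n\n"):
        if not block.strip() or block.startswith(MARKERS):
            result.append(block)
            continue
        # Join the paragraph into one line, then wrap it again evenly.
        joined = " ".join(block.split())
        result.append(textwrap.fill(joined, width=width,
                                    break_long_words=False,
                                    break_on_hyphens=False))
    return "\n\n".join(result)


text = ("# Title\n\n"
        "short line\n"
        "followed by a much longer line that keeps going and going\n"
        "end")
print(reflow(text, width=30))
```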
<p>I also tried to avoid some typical mistakes while using quotation marks or brackets: sometimes we forget to close them. So <code>textu</code> is for this quality check.</p>
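<p>A check of this kind can be done with a stack. The Python sketch below is an illustration of the technique and not the code of <code>textu</code>: the <code>unclosed</code> function and the set of pairs are assumptions, and straight quotes are ignored because their opening and closing characters are identical.</p>

```python
# Opening character -> closing character.
PAIRS = {"(": ")", "[": "]", "{": "}", "“": "”", "«": "»"}


def unclosed(text):
    """Return (line number, character) for every unmatched opener or closer."""
    closers = {close: open_ for open_, close in PAIRS.items()}
    stack = []      # openers still waiting for their closer
    problems = []   # closers that had no matching opener
    for lineno, line in enumerate(text.splitlines(), start=1):
        for ch in line:
            if ch in PAIRS:
                stack.append((lineno, ch))
            elif ch in closers:
                if stack and stack[-1][1] == closers[ch]:
                    stack.pop()
                else:
                    problems.append((lineno, ch))
    # Whatever is left on the stack was never closed.
    return sorted(problems + stack)


draft = "This opens (a parenthesis\nand closes “a quote” but forgets the rest."
print(unclosed(draft))
```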
<p>These three programs were very helpful for my writing, which is why we decided to continue their development as a Ruby gem. For our work and personal projects, <span class="smallcap">MD</span> is our main format, so we feel obliged to provide tools that help other writers and publishers who use Markdown.</p>
<h3 id="baby-biber">Baby Biber</h3>
<p>If you are into the TeX family, you probably know <a href="https://en.wikipedia.org/wiki/Biber_(LaTeX)">Biber</a>, the bibliography processing program. With Biber we are able to compile bibliographic entries of BibLaTeX in <span class="smallcap">PDF</span> outputs and carry out checks or cleanups.</p>
<p>I started to have issues with the references because our publishing method implies the deployment of outputs in separate processes from the same inputs, in this case <span class="smallcap">MD</span> and <span class="smallcap">BIB</span> formats. With Biber I was able to add the bibliographic entries but only for <span class="smallcap">PDF</span>.</p>
<p>The solution I came to was the addition of references in <span class="smallcap">MD</span> before any other process. In doing this, I merged the inputs into one <span class="smallcap">MD</span> file. This new file is used for the deployment of all the outputs.</p>
<p>This solution implies the use of Biber as a clean up tool and the development of a program that processes bibliographic entries of BibLaTeX inside Markdown files. <a href="https://gitlab.com/snippets/1917492">Baby Biber</a> is this program. I wanted to honor Biber and make clear that this program is still in its baby stages.</p>
<p>What does Baby Biber do?</p>
<ul>
<li>
<p>It generates a new <span class="smallcap">MD</span> file with references and bibliography.</p>
</li>
<li>
<p>It adds references wherever the original <span class="smallcap">MD</span> file calls <code>@textcite</code> or <code>@parencite</code> with a correct BibLaTeX id.</p>
</li>
<li>
<p>It adds the bibliography to the end of the document according to the called references.</p>
</li>
</ul>
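<p>The third step, a bibliography built only from the references that were actually called, can be sketched as follows. This Python snippet is not Baby Biber: the <code>append_bibliography</code> function, the <code># Bibliography</code> heading and the author-year-title line format are placeholders made up for the example, whereas the real program takes its formatting from a configuration file.</p>

```python
import re

# Captures the id inside @textcite[...] or @parencite[...].
CITE = re.compile(r"@(?:textcite|parencite)\[([^\]]+)\]")


def append_bibliography(markdown, entries, heading="# Bibliography"):
    """Append one line per cited entry, sorted by author and year.

    `entries` maps an id to a dict with "author", "year" and "title".
    Entries that are never cited are left out.
    """
    cited = {key for key in CITE.findall(markdown) if key in entries}
    if not cited:
        return markdown
    ordered = sorted(cited, key=lambda k: (entries[k]["author"], entries[k]["year"]))
    lines = [
        f'{entries[k]["author"]} ({entries[k]["year"]}). _{entries[k]["title"]}_.'
        for k in ordered
    ]
    return markdown.rstrip("\n") + "\n\n" + heading + "\n\n" + "\n\n".join(lines) + "\n"


entries = {
    "proudhon1862a": {"author": "Proudhon, Pierre J.", "year": "1862",
                      "title": "Les Majorats littéraires"},
    "someone2020a": {"author": "Someone", "year": "2020", "title": "Unused"},
}
source = "As @textcite[proudhon1862a] argues, and again @parencite[proudhon1862a]."
print(append_bibliography(source, entries))
```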
<p>One headache with references and bibliography styles is how to customize them. With Pandoc you can use <a href="https://github.com/jgm/pandoc-citeproc"><code>pandoc-citeproc</code></a> which allows you to select any style written in <a href="https://en.wikipedia.org/wiki/Citation_Style_Language">Citation Style Language (<span class="smallcap">CSL</span>)</a>. These styles are in <span class="smallcap">XML</span> and it is a serious thing: you should apply these standards. You can check different <span class="smallcap">CSL</span> citation styles in its <a href="https://github.com/citation-style-language/styles">official repo</a>.</p>
<p>Baby Biber doesn't support <span class="smallcap">CSL</span>! Instead, it uses <a href="https://en.wikipedia.org/wiki/YAML"><span class="smallcap">YAML</span></a> format for <a href="https://gitlab.com/snippets/1917513">its configuration</a>. This is because of two issues:</p>
<ol>
<li>
<p>I didn't take the time to read how to implement <span class="smallcap">CSL</span> citation styles.</p>
</li>
<li>
<p>My university allows me to use any kind of citation style as long as it is uniform and displays the information in a clear manner.</p>
</li>
</ol>
<p>So, yeah, I have a huge debt here. And maybe it will stay like that. The new version of Pecas will implement and improve the work done by Baby Biber—I hope.</p>
<figure>
<img src="../../../img/p005_i005.png" alt="Baby Biber sample config file."/>
<figcaption>
Baby Biber sample config file.
</figcaption>
</figure>
<h3 id="pdf-exporter">PDF exporter</h3>
<p>The last script I wrote is for the automation of <span class="smallcap">PDF</span> compilation with LuaLaTeX and Biber (optionally).</p>
<p>I don't like the default layouts of Pandoc and I could have read the docs in order to change that behavior, but I decided to experiment a bit. The new version of Pecas will implement <span class="smallcap">PDF</span> outputs, so I wanted to play around a little with the formatting, as I did with Baby Biber. Besides, I needed a quick program for <span class="smallcap">PDF</span> outputs because we sometimes publish <a href="http://zines.perrotuerto.blog/">fanzines</a>.</p>
<p>So, <a href="https://gitlab.com/snippets/1917490"><code>export-pdf</code></a> is the experiment. It uses Pandoc to convert <span class="smallcap">MD</span> to <span class="smallcap">TEX</span> files. Then it does some clean up and injects the template. Finally, it compiles the <span class="smallcap">PDF</span> with LuaLaTeX and Biber—if you want to add the bibliographic entries this way. It also exports a <span class="smallcap">PDF</span> booklet with <code>pdfbook2</code>, but I don't deploy it in this repo because the <span class="smallcap">PDF</span> is letter size, too large for a booklet.</p>
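<p>The order of the steps can be sketched as a list of command lines. This Python snippet is not <code>export-pdf</code> (a Ruby script): the function names are hypothetical, the clean up and template injection are replaced here by Pandoc's <code>--standalone</code> option, the booklet step is left out, and the Biber branch assumes the <span class="smallcap">TEX</span> file actually loads BibLaTeX.</p>

```python
import subprocess
from pathlib import Path


def pdf_commands(md_file, use_biber=False):
    """Return the command lines for MD -> TEX -> PDF, in run order."""
    md = Path(md_file)
    tex = md.with_suffix(".tex").name
    latex = ["lualatex", "-interaction=nonstopmode", tex]
    commands = [["pandoc", md.name, "--standalone", "-o", tex], latex]
    if use_biber:
        # Usual BibLaTeX cycle: compile, run Biber, compile twice more.
        commands += [["biber", md.stem], latex, latex]
    return commands


def export_pdf(md_file, use_biber=False):
    """Run every step inside the directory of the Markdown file."""
    workdir = Path(md_file).resolve().parent
    for command in pdf_commands(md_file, use_biber):
        subprocess.run(command, cwd=workdir, check=True)


# Only print the plan; call export_pdf("tesis.md") to actually run it.
for command in pdf_commands("tesis.md", use_biber=True):
    print(" ".join(command))
```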
<p>I have a huge debt here that I won't pay. It is cool to have a program for <span class="smallcap">PDF</span> outputs that I understand, but I still want to experiment with <a href="https://en.wikipedia.org/wiki/ConTeXt">ConTeXt</a>.</p>
<p>I think ConTeXt could be a useful tool while using <span class="smallcap">XML</span> files for <span class="smallcap">PDF</span> outputs. I defend Markdown as input format for writers and publishers, but for automation <span class="smallcap">XML</span> format is way better. For the new version of Pecas I have been thinking about the possibility of using <span class="smallcap">XML</span> for any kind of standard output like <span class="smallcap">EPUB</span>, <span class="smallcap">PDF</span> or <span class="smallcap">JATS</span>. I have problems with <span class="smallcap">TEX</span> format because it generates an additional format just for one output. Why would I accept that if <span class="smallcap">XML</span> can provide me with at least three outputs?</p>
<figure>
<img src="../../../img/p005_i006.png" alt="export-pdf Ruby code."/>
<figcaption>
<code>export-pdf</code> Ruby code.
</figcaption>
</figure>
<h3 id="third-parties">Third parties</h3>
<p>I already mentioned the third party software I used for this repo:</p>
<ul>
<li>
<p>Vim as a main text editor.</p>
</li>
<li>
<p>Gedit as a side text editor.</p>
</li>
<li>
<p>JabRef as a bibliography manager.</p>
</li>
<li>
<p>Pandoc as a document converter.</p>
</li>
<li>
<p>LuaLaTeX as a <span class="smallcap">PDF</span> engine.</p>
</li>
<li>
<p>Biber as a bibliography cleaner.</p>
</li>
</ul>
<p>The tools I developed and this software are all <span class="smallcap">FOSS</span>, so you can use them if you want without paying or asking for permission—and without warranty xD</p>
<h2 id="deployment">Deployment</h2>
<p>There is a fundamental design issue in this research as an automated repository: I should have put all the scripts in one place. At the beginning of the research I thought it would be easier to place each script next to its input or output. Over time I realized that it wasn't a good idea.</p>
<p>The good thing is that there is one script that works as a <a href="https://en.wikipedia.org/wiki/Wrapper_function">wrapper</a>. You don't really have to know anything about it. You just write the research in Markdown, fill in the BibLaTeX bibliography and call that script whenever you want, or whenever your server is configured to.</p>
<p>This is a simplified listing showing where each script, input and output sits inside the repo:</p>
<pre class="">
<code class="code-line-1">.</code><code class="code-line-2">├─ [01] bibliografia</code><code class="code-line-3">│   ├─ [02] bibliografia.bib</code><code class="code-line-4">│   ├─ [03] bibliografia.html</code><code class="code-line-5">│   ├─ [04] clean.sh</code><code class="code-line-6">│   ├─ [05] config.yaml</code><code class="code-line-7">│   └─ [06] recursos</code><code class="code-line-8">├─ [07] index.html</code><code class="code-line-9">└─ [08] tesis</code><code class="code-line-10"> ├─ [09] docx</code><code class="code-line-11"> │   ├─ [10] generate</code><code class="code-line-12"> │   └─ [11] tesis.docx</code><code class="code-line-13"> ├─ [12] ebooks</code><code class="code-line-14"> │   ├─ [13] generate</code><code class="code-line-15"> │   └─ [14] out</code><code class="code-line-16"> │   ├─ [15] generate.sh</code><code class="code-line-17"> │   ├─ [16] meta-data.yaml</code><code class="code-line-18"> │   ├─ [17] tesis.epub</code><code class="code-line-19"> │   └─ [18] tesis.mobi</code><code class="code-line-20"> ├─ [19] generate-all</code><code class="code-line-21"> ├─ [20] html</code><code class="code-line-22"> │ ├─ [21] generate</code><code class="code-line-23"> │ └─ [22] tesis.html</code><code class="code-line-24"> ├─ [23] md</code><code class="code-line-25"> │   ├─ [24] add-bib</code><code class="code-line-26"> │   ├─ [25] tesis.md</code><code class="code-line-27"> │   └─ [26] tesis_with-bib.md</code><code class="code-line-28"> └─ [27] pdf</code><code class="code-line-29"> ├─ [28] generate</code><code class="code-line-30"> └─ [29] tesis.pdf</code>
</pre>
<h3 id="bibliography-pathway">Bibliography pathway</h3>
<p>Even with a simplified view you can see how this repo is a fucking mess. The bibliography [01] and the thesis [08] are the main directories in this repo. As a sibling you have the website [07].</p>
<p>The bibliography directory isn't part of the automation process. I worked on the <span class="smallcap">BIB</span> file [02] at different times than my writing. I exported it to <span class="smallcap">HTML</span> [03] every time I used JabRef. This <span class="smallcap">HTML</span> is for queries from the browser. There is also a simple script [04] to clean the bibliography with Biber, and the configuration file [05] for Baby Biber. Are you a data hoarder? There is a special directory [06] for you with all the works used for this research ;)</p>
<h3 id="engine-on">Engine on</h3>
<p>The thesis directory [08] is where everything moves smoothly when you call <code>generate-all</code> [19], the wrapper that turns on the engine!</p>
<p>The wrapper does the following steps:</p>
<ol>
<li>
<p>It adds the bibliography [24] to the original <span class="smallcap">MD</span> file [25], leaving a new file [26] to act as input.</p>
</li>
<li>
<p>It generates [21] the <span class="smallcap">HTML</span> output [22].</p>
</li>
<li>
<p>It compiles [28] the <span class="smallcap">PDF</span> output [29].</p>
</li>
<li>
<p>It generates [13] the <span class="smallcap">EPUB</span> [17] and <span class="smallcap">MOBI</span> [18] according to their metadata [16] and Pecas config file [15].</p>
</li>
<li>
<p>It exports [10] the <span class="smallcap">MD</span> to <span class="smallcap">DOCX</span> [11].</p>
</li>
<li>
<p>It moves the analytics to its correct directory.</p>
</li>
<li>
<p>It refreshes the modification date in the index [07].</p>
</li>
<li>
<p>It puts the new rolling release's hash in the index.</p>
</li>
<li>
<p>It puts the <span class="smallcap">MD5</span> checksum of all outputs in the index.</p>
</li>
</ol>
<p>And that's it. The process of developing a thesis as an automated repository allows me to just worry about three things:</p>
<ol>
<li>
<p>Write the research.</p>
</li>
<li>
<p>Manage the bibliography.</p>
</li>
<li>
<p>Deploy all outputs automatically.</p>
</li>
</ol>
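<p>The wrapper steps listed above can be sketched roughly like this. It is only an illustration, not the actual <code>generate-all</code> script: the paths come from the simplified listing, while the <code>STEPS</code> table, the <code>run_all</code> method and its <code>runner</code> parameter are hypothetical names made up for the example. The last four steps (analytics, date, release hash and checksums) are left out.</p>

```ruby
# Hypothetical step table. The paths follow the simplified listing
# (relative to the tesis directory); the labels are made up for this sketch.
STEPS = [
  ["add bibliography", "md/add-bib"],
  ["HTML",             "html/generate"],
  ["PDF",              "pdf/generate"],
  ["EPUB and MOBI",    "ebooks/generate"],
  ["DOCX",             "docx/generate"]
].freeze

# Runs every step in order and stops at the first one that fails.
# `runner` is injectable (an assumption of this sketch) so the flow
# can be tried without the real scripts being present.
def run_all(steps, runner: lambda { |path| system(path) })
  done = []
  steps.each do |label, path|
    ok = runner.call(path)
    raise "step failed: #{label} (#{path})" unless ok
    done.push(label)
  end
  done
end

# Dry run: pretend every script succeeds and print the labels in order.
puts run_all(STEPS, runner: lambda { |_path| true }).join(", ")
```

<p>Keeping the order in a single table is one way to avoid the scattered-scripts problem: the wrapper becomes the only place that needs to know where each script lives.</p>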
<h3 id="the-legal-stuff">The legal stuff</h3>
<p>That's how it works, but we still have to talk about how the thesis can <i>legally</i> be used…</p>
<p>This research was paid for by every Mexican through taxation. The National Council of Science and Technology (abbreviated Conacyt) granted me a scholarship to study a Master's in Philosophy at <span class="smallcap">UNAM</span>—yeah, American and British folks, more likely than not, we get paid here for our graduate studies.</p>
<p>This scholarship is a problematic privilege. So the least I can do in return is to liberate everything that was paid for by my homies and give free workshops and advice. I repeat: it is <i>the least</i> we can do. I disagree with using this privilege to live a lavish or party lifestyle only to then drop out. In a country with many crises, scholarships are granted to improve your communities, not only yourself.</p>
<p>In general, I have the conviction that if you are a researcher or a graduate student and you already get paid (it doesn't matter if it's a salary or a scholarship, it doesn't matter if you are in a public or private university, it doesn't matter if you get the money from public or private administrations), you have a commitment to your community, to our species and to our planet. If you wanna talk about free labor and exploitation, which does happen, please look at the bottom. In this shitty world you are on the upper levels of this <a href="https://es.crimethinc.com/posters/capitalism-is-a-pyramid-scheme">nonsense pyramid</a>.</p>
<p>As a researcher, scientist, philosopher, theorist, artist and so on, you have an obligation to help other people. You can still feed your ego and believe you are the shit or the next <span class="smallcap">AAA</span> thinker, philosopher or artist. These two things don't clash, but it's still annoying.</p>
<p>That is why this research has a <a href="https://wiki.p2pfoundation.net/Copyfarleft">copyfarleft</a> license for its content and a copyleft license for its code. Actually, it's the same licensing scheme as <a href="https://perrotuerto.blog/content/html/en/_fork.html">this blog</a>.</p>
<p>With the <a href="https://leal.perrotuerto.blog/">Open and Free Publishing License</a> (abbreviated <span class="smallcap">LEAL</span>, which also means “loyal” in Spanish) you are free to use, copy, reedit, modify, share or sell any of this content under the following conditions:</p>
<ul>
<li>
<p>Anything produced with this content must be under some type of <span class="smallcap">LEAL</span>.</p>
</li>
<li>
<p>All files—editable or final formats—must be publicly accessible.</p>
</li>
<li>
<p>The content usage cannot imply defamation, exploitation or surveillance.</p>
</li>
</ul>
<p>You could remove my name and put yours; it's permitted. You could even modify the content and write that I <span class="smallcap">LOVE</span> intellectual property: there isn't a technical solution to avoid such defamation. But the <span class="smallcap">MD5</span> checksums show whether the files were modified by others. Even if the files differ by one bit, the <span class="smallcap">MD5</span> checksum is gonna be different.</p>
<p>Copyfarleft is the way (but not the solution) that suits our context and our possibilities of freedom. Don't come here with your liberal and individualistic notion of freedom, like the dudes from <span id="weblate">Weblate</span> that kicked this blog out because its content license “is not free,” even though they say the code, but not the content, should use a “free” license, like the fucking <span class="smallcap">GPL</span> this blog has for its code. This type of liberal freedom doesn't work in a place where no State or corporation can guarantee us a minimum set of individual freedoms, as happens in Asia, Africa and the other America: Latin America and the America that isn't portrayed in the “American Dream” ads.</p>
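<p>To make the one-bit claim concrete, here is a minimal sketch using Ruby's standard <code>Digest::MD5</code>. The sample text and the helper names <code>md5_of</code> and <code>flip_one_bit</code> are invented for the example; they are not part of the repo.</p>

```ruby
require "digest/md5"

# Returns the MD5 hex digest of a string of bytes.
def md5_of(bytes)
  Digest::MD5.hexdigest(bytes)
end

# Flips the lowest bit of the first byte and leaves the rest untouched.
# Hypothetical helper, only here to simulate a one-bit modification.
def flip_one_bit(bytes)
  copy = bytes.dup.force_encoding(Encoding::BINARY)
  copy.setbyte(0, copy.getbyte(0) ^ 1)
  copy
end

original = "thesis as published" # stand-in for the content of an output file
tampered = flip_one_bit(original)

same = md5_of(original) == md5_of(tampered)
puts md5_of(original)
puts md5_of(tampered)
puts(same ? "checksums match" : "checksums differ")
```

<p>This catches accidental or casual changes. Keep in mind that deliberately crafted <span class="smallcap">MD5</span> collisions are known to exist, so a stronger hash such as <span class="smallcap">SHA</span>-256 is a common choice when tampering by a motivated attacker is the concern.</p>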
<h2 id="last-thoughts">Last thoughts</h2>
<p>As a thesis works with a hypothesis, the technical and legal pathway of this research works with the possibility of having a thesis as an automated repository, instead of a thesis as a file. In the end, the possibility became a fact, but in a limited way.</p>
<p>I think that the idea of a thesis as an automated repo is doable and could be a better way to deploy research than uploading a single file. But this implementation has many flaws that make it unsuitable for scaling.</p>
<p>Further work is necessary to be able to ship this as a standard practice. This technique could also be applied to automation and uniformity among publications, like papers in a journal or a book collection. The required labor isn't too much, and <i>maybe</i> it's something I would engage with during a PhD. But for now, this is all that I can offer!</p>
<p>Thanks to <a href="https://twitter.com/hacklib">@hacklib</a> for pushing me to write this post and, again, thanks to my <span class="smallcap">S.O.</span> for persuading me to study a Master's degree and for reviewing this post. Thanks to my tutor, Mel and Gaby for their academic support. I can't forget to give thanks to <a href="https://hacklab.cc">Colima Hacklab</a>, <a href="https://ranchoelectronico.org">Rancho Electrónico</a> and <a href="https://t.me/miau2018">Miau Telegram Group</a> for their technical support. And also thanks to all the people and organizations I mentioned in the acknowledgment section of the research!</p>
<script type="text/javascript" src="../../../hashover/comments.php"></script>
</section>
<footer>
<p class="left no-indent">Texts and images are under <a href="../../../content/html/en/_fork.html">Open and Free Publishing License (<span class="smallcap">LEAL</span>)</a>.</p>
<p class="left no-indent">Code is under <a target="_blank" href="https://www.gnu.org/licenses/gpl-faq.en.html"><span class="smallcap">GNU</span> General Public License (<span class="smallcap">GPL</span>v3)</a>.</p>
<p class="left no-indent">Last build of this page: 2020/02/20, 10:23.</p>
<p class="left no-indent"><span class="smallcap"><a target="_blank" href="https://perrotuerto.blog/feed/en/rss.xml">RSS</a></span> | <a href="../../../content/html/en/005_hiim-master.html"><span class="versalita">EN</span></a> | <a href="../../../content/html/es/005_hiim-master.html"><span class="versalita">ES</span></a></p>
</footer>
</body>
</html>