Join us for our weekly series of short talks: nf-core/bytesize.

Just 15 minutes + questions, we focus on topics about using and developing nf-core pipelines. These are recorded and made available at https://nf-co.re, helping to build an archive of training material. Got an idea for a talk? Let us know on the #bytesize Slack channel!

This week, Phil Ewels (@ewels) will present: Code linting tools

Code linting tools are now an essential part of the nf-core development framework, and once you’re set up they’re super easy to use. Phil will tell us how to use Prettier to make your code, well, Prettier!

Video transcription
Note

The content has been edited to make it reader-friendly

0:01 Thank you very much everybody for joining, either live or catching up on YouTube at a later date. Today’s bytesize is just a short one. It’s going to be about half slides and half live demo. I’m going to talk through some of the tools that we use for code linting, specifically code style linting. I’m not going to talk about the nf-core lint commands in this talk. There are other bytesize talks specifically about that. This is more about the general-purpose tools that we use for linting code style.

0:38 What do I mean by that? As I’m sure anyone who’s done even a little bit of development work will know, how you format your code is a contentious issue. I started looking for a good gif or meme to put in and, of course, XKCD came up with the goods, with some rather excellent insults about code style. Once I started, it turned out to be a pretty deep trove. There are a lot of good comics out there about the idiosyncrasies of code formatting. Any of you who watched Silicon Valley or other TV programs will be familiar with the tabs vs. spaces argument. Anyway, the point here is that everyone has quite individual views on how code should be formatted: not the content of the code, just the white space and how things are laid out.

1:33 This is where code formatters come in. The idea of code formatters or linters is that they set the style you’re going to use. They decide how things should be formatted, and that means that you don’t have to. It’s good because it frees up a little bit of space in your brain. You’re not having to think about whether to use single quotes or double quotes, how much indentation you should be using, things like this, because it just happens. The other thing is that they’re really good for projects with lots of contributors, like nf-core, because we have lots of different people coming in and working on the same code base. Everyone has their own idea of how things should be formatted. If you’ve ever come in and read someone else’s code, you’ll know that it can be very difficult to read if it’s not in a style that you’re used to. Code formatters force everyone to write code in exactly the same way. That makes it much easier to collaborate.

2:30 It also means that when you change things, the differences in the code, the code diffs that we see on GitHub are just the important things, the actual code that you’re changing. We don’t get lots of spurious diffs where white space is being changed. Code formatters think about style so that you don’t have to. I’ve put the little asterisks down in the corner here that they’re highly addictive. I think code formatters are a bit like wearing a seatbelt in a car, or gloves in a lab, or using Git to manage source control. Once you start using code linters, you’ll find it’s very difficult to work without them. I’ve been using them for a year or two now, and now I find it really irritating whenever I’m in a project that doesn’t have them. I notice always little inconsistencies and I find myself wasting time on thinking about this. Once you start using code linters, I suspect that many of you will find it difficult to go back. I hope so anyway.

3:29 These are the different languages within nf-core that you’ll come across, apart from the obvious Nextflow. We’ve got a bunch of Python code, especially in the nf-core tools package, and also some scripts within pipelines. We’ve got a lot of documentation written in Markdown. We primarily use YAML for configuration files, and also JSON. Then on the website, we’ve got a few extra languages: it is written with PHP, CSS, and HTML. These are the main languages that I’m going to talk about today. There aren’t very many tools that we use. We use basically two tools, maybe three. Python we format using a tool called Black, which is the most popular and commonly used code formatter for Python, used by a lot of big projects now. It was Black that got me into this in the first place.

4:29 Then we use a tool called Prettier to do everything else. Until the last release of tools, we used a tool called markdownlint for Markdown and yamllint for YAML. We had numerous different individual linters, but we replaced those in the last major release of tools. Now we just use Prettier for everything. If you did a pipeline sync to the template recently, you will have seen quite a lot of minor changes, things like quotes in your YAML files. That was because we switched to using Prettier and are now using that standard. Then the little mouse down in the bottom corner is a tool called EditorConfig, which is a standard that tells your code editors what the code style is for a project. Both of these other tools tie into that config as well. It’s also a standalone tool that’s generic for any language. That one does stuff like checking that indentation is in multiples of two. If you try to indent any code by three spaces, it will complain, stuff like that.

5:35 Okay, so these are the three projects we’re going to talk about and these are their websites. They all come with a command line tool that you can run, and they’re very similar. With Black you can do black --check, with Prettier you can do prettier --check, and the EditorConfig checker is a tool specifically for checking. You just give it a bunch of files. Normally it would be an asterisk for all of the different files in the project. Black and Prettier especially will just ignore any files that they don’t recognize. They will go through and check the syntax of all of your different files, and they’ll throw a warning if anything doesn’t conform to the style that they like.

6:17 What is particularly good about these tools, and one of the main reasons we switched to Prettier recently, is that these commands will also fix it for you. You don’t have to go and look up every single line like we used to with markdownlint, with messages like “line 432 is using the wrong style of bullet point”. Now you can just do prettier --write, go, and Prettier will run through, format everything for you and just fix it. That’s great. That saves a lot of time.
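As a rough illustration of the check-versus-fix split described above, here is a minimal Python sketch that builds the command line for either mode. The `--check` and `--write` flags are the ones named in the talk, and Black rewrites files in place when run without `--check`. The helper names `formatter_command` and `run`, and the default target `.`, are assumptions made for this example and are not part of any nf-core tooling.

```python
import shutil
import subprocess


def formatter_command(tool: str, fix: bool, target: str = ".") -> list[str]:
    """Build the argv for a report-only run (fix=False) or a rewriting run (fix=True)."""
    if tool == "prettier":
        # Prettier needs an explicit mode: --check only reports, --write rewrites files.
        return ["prettier", "--write" if fix else "--check", target]
    if tool == "black":
        # Black rewrites files by default; --check turns it into a report-only run.
        return ["black", target] if fix else ["black", "--check", target]
    raise ValueError(f"unknown tool: {tool}")


def run(tool: str, fix: bool = False) -> int:
    """Run the tool if it is installed and return its exit code (hypothetical helper)."""
    if shutil.which(tool) is None:
        print(f"{tool} is not installed, skipping")
        return 0
    return subprocess.run(formatter_command(tool, fix)).returncode
```

Calling `run("prettier")` would only report problems, while `run("prettier", fix=True)` would rewrite the files, assuming Prettier is installed and on the `PATH`.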

6:44 Before I go on to the live demos, I’m just going to walk through some stuff. There are three main ways that you’re likely to come into contact with these tools. For many of you, the first will be through the continuous integration tests. You’ll push a commit or open a pull request and you’ll have a failure on the CI. You’ll have the red cross and it will say that Prettier failed. It probably says something like, “oops, forgot to format your files with Prettier”. Usually pull requests will not be merged until that turns into a little green tick. That might be the first time you see this and that might be how you ended up on this talk. The second way, and one of the first places you might go to fix this problem, is the command line. Like I just showed, these tools have commands. You can go back to your code base, run prettier --write, which will fix the problem, and then push another commit. Then hopefully that red cross will turn into a little tick. But really the best way to use these tools is to have them set up in your code editor. When that’s done properly, they will run automatically every time you save a file. You don’t ever have to think about running them because everything is automatically formatted properly. That’s the best way to work really. Once you’ve set that up, you can just forget about this and it just works.

8:05 Right. Let’s see if I can screen share a little portion of my screen. Hopefully everyone can see that bit of screen. Zoom throws everything around my window. Wave at me if you can’t see a website that I’m looking at for the Black documentation. Good. Just to show you the websites for these tools, this is the Black documentation. You basically don’t ever need to look at this, but just so you know, this is what it looks like. If you Google Python Black, you’ll find it. This is a website for Prettier. Again, it talks about how it works and different languages it supports and the different code editors that it has integrations with, which is basically all of them. If you’re using something other than VSCode, which is what I’m going to show you in a minute, you can go along here and install a plugin for Atom or whatever else. It’s got quite good documentation.

9:24 The final one is EditorConfig, which, like I say, is a way of standardizing config files across editors and projects for things like what type of new lines to use, character encoding, indent size and stuff like this. Certainly Prettier will find these EditorConfig files and load its settings from them. They integrate together. A lot of editors have built-in support for EditorConfig and some need a plugin. Again, you can come here and get your Vim plugin for EditorConfig. Basically, if this file is in your project, it will override your normal local settings.
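To make the kind of rule EditorConfig describes more concrete, here is a simplified Python sketch of a style check for indentation, trailing whitespace and a final newline. It is only an illustration of the idea and is not how the real EditorConfig checker is implemented; the function name `style_problems` and the default indent size of two are assumptions for this example.

```python
def style_problems(text: str, indent_size: int = 2) -> list[str]:
    """Return a list of human-readable style problems found in the text."""
    problems = []
    for number, line in enumerate(text.split("\n"), start=1):
        # Trailing spaces or tabs at the end of a line.
        if line != line.rstrip():
            problems.append(f"line {number}: trailing whitespace")
        # Leading spaces must be a multiple of the configured indent size.
        stripped = line.lstrip(" ")
        indent = len(line) - len(stripped)
        if stripped and indent % indent_size != 0:
            problems.append(
                f"line {number}: indent of {indent} is not a multiple of {indent_size}"
            )
    # Non-empty files are expected to end with a newline.
    if text and not text.endswith("\n"):
        problems.append("missing final newline")
    return problems


sample = "a:\n   b: 1\n  c: 2  \n"
for problem in style_problems(sample):
    print(problem)
```

The sample has one line indented by three spaces and one line with trailing spaces, so both are reported, while a cleanly formatted file returns an empty list.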

10:03 Then I said it ran on continuous integration. Just to show you where this is: within the pipeline template there’s a file in the GitHub workflows called “linting.yaml”. You can see this is running EditorConfig here and this is the exact command it runs. If ever you want to emulate the CI tests running on GitHub, you can run these commands yourself on your local system and you should get the same results, as long as you’ve got the same versions of the tools installed. You can see it’s just running the Prettier check and EditorConfig. If we look in the main tools package, where we’ve got a load of Python code, you’ll see it’s also running Black, and it’s actually using a GitHub Action to run Black.
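One way to emulate the CI locally is a small script that runs every check and reports which ones failed, much as a CI job turns red when any step exits with a non-zero code. The sketch below is a hedged Python illustration: `run_checks` is a hypothetical helper, and the exact commands should be copied from the workflow file of the project at hand, not from this example.

```python
import subprocess
import sys


def run_checks(commands: list[list[str]]) -> list[str]:
    """Run each command in turn and return the command lines that failed."""
    failed = []
    for argv in commands:
        try:
            code = subprocess.run(argv).returncode
        except FileNotFoundError:
            # The tool is not installed; count that as a failure too.
            code = 127
        if code != 0:
            print(f"FAILED: {' '.join(argv)}", file=sys.stderr)
            failed.append(" ".join(argv))
    return failed
```

For example, `run_checks([["prettier", "--check", "."], ["black", "--check", "."]])` would run both checks and return an empty list only if both pass. Results will only match the CI if the locally installed tool versions match too.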

11:00 Just to do a quick live demo now, I’ve got the RNA-Seq pipeline open here and I’ve just made a couple of basic dummy changes. If I do git diff, I wonder if I can do git dunk, it’s a new tool I found the other day. Okay. You can see here that I’ve just made a few changes, I’ve added something to a changelog file here and I’ve just made some white space changes to this YAML file just for the purposes of this demo. Now, right away, you can see that. These are the changes here, the GitHub changes, you can see it looks a bit weird. It’s probably not surprising that this is going to fail, but this is a valid YAML file. This is the nf-core config file. It’s customizing a bit about how the nf-core tools work, the linting tests and stuff. But all of this is valid YAML, I haven’t actually changed any of the real meaning of this file and it should have run fine. But you can see my indentation is a bit wonky here and I’m using some single quotes here and whatnot.

12:02 I’ve turned everything off at the moment, so if I hit save, then I can do prettier --check. This will check all the YAML and Markdown files and, sure enough, it says there’s something wrong here. The .nf-core.yml file has got a warning, and so has the changelog, where I added a little note about what I’ve done here. I know something’s wrong, and if I had pushed my code anyway, I would get a warning on GitHub from GitHub Actions. I’m just going to commit this now, so we’ll be able to see what changes after this. I’m in VSCode, so I can run prettier --write. That goes through all the different files it recognizes. You can see, if I do git status, that it has actually modified these two files. You’ll see I did git add first, so I committed those changes first. I think that’s good practice: before you run code linters, commit your actual changes, the things you’re thinking about, first. Then if you’re going to make big linting changes, you can do that as a separate commit and it’s very easy to see the diff of what’s changed in case you’re nervous about it. Later on, you get used to running it and you won’t need to do a separate commit, just at the beginning.

13:19 If I look at the diff, you’ll see that both of these files have been changed and you can see it’s basically messed around with some of the formatting and white space. Sure enough, in this file now, all the indentation is fixed. It’s all using two-space indentation everywhere. My single quotes have been converted to double quotes and all the extra line breaks were removed. If I’d had any trailing spaces at the end of lines, they would have been removed too. Over in the changelog, you can see my Markdown now correctly has a blank line after this heading. My bold text with double underscores and my italics with asterisks have been changed to the standard that we use, which is underscores for italics and double asterisks for bold. These minor changes didn’t make any difference to the actual contents of the file, but now we have a nice consistent usage of Markdown and YAML. Great.
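The point that reformatting changes layout but not content can be checked mechanically. The sketch below uses Python's standard `json` module as a stand-in for Prettier, which is a separate Node.js tool, so this illustrates the principle only and does not reproduce Prettier's output. The keys in the sample document are hypothetical.

```python
import json


def reformat_json(text: str, indent: int = 4) -> str:
    """Re-emit JSON with consistent indentation; only the layout changes."""
    return json.dumps(json.loads(text), indent=indent) + "\n"


def same_content(before: str, after: str) -> bool:
    """True when both strings parse to the same data, whatever their layout."""
    return json.loads(before) == json.loads(after)


# Valid JSON with wonky spacing and line breaks (hypothetical keys).
messy = '{"lint":   {"files_unchanged": [ "CHANGELOG.md" ]},\n        "template": {"name":"demo"}}'
tidy = reformat_json(messy)

print(tidy)
print(same_content(messy, tidy))
```

A comparison like this is a quick sanity check when a large formatting-only commit makes a diff hard to review by eye.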

14:11 If I just put this back to how it was, I will show you the other way of doing it. Save this. That was on the command line, but I said the best way to do it is with the code editor. To do this, I’m going to use a couple of plugins with VSCode. Like I say, you can do this with basically any editor. For VSCode, if you go to the marketplace or just go to the extensions tab here, you can search for Prettier. There are a few results, but it’s the obvious big one that’s got 20 million installs. Basically, you just hit install there and things should work. The same goes for EditorConfig: there’s a plugin for VSCode. You may not need this.

15:13 Then Black is a bit more complicated because it’s actually built into VSCode. You need the Python add-on. Then if you go into settings and type Black, you’ll see the Python formatting provider, and you want to set that to Black. Then it runs Black whenever you edit Python code. If you’re never going to edit Python code in the tools package, don’t worry about that. Okay. I’ve got them installed and set up. Now, if you’re doing this on some other project outside of nf-core, it’s important to note that we have a few files in the root of the repo here which are helping us out. For the EditorConfig tool, this guy, we have an .editorconfig file that sets everything up. It says what the indentation should be and things like that. For Prettier, we have a .prettierignore file, which is like a .gitignore file and tells it to ignore certain stuff, and a .prettierrc.yml file, which has some settings. Here, we just specify that we want the width to be 120. I can’t remember what the default is, but it’s quite narrow. We don’t change very much. There are some config files in there.

16:19 Then once that’s installed, I can go to the command palette here. That was command+shift+P for me, because I’m on a Mac. You can see there’s an option called Format Document. I can run that and, oof, done. That’s a bit better than the command line, but it’s still not great, because I still need to remember to run that command all the time. We’re trying to get away from having to remember anything. If I go back into settings here and look for format on save, there’s the magic tick box here, Editor: Format On Save. This tells VSCode to do what it says: to format your files whenever you hit save. Now if I go to the changelog Markdown file and make some change, then hit save, you’ll see that Prettier ran, fixed this extra line break and fixed the Markdown here. I can put a bunch of trailing spaces in here, which are there. If I hit save, they disappear, because Prettier runs every single time I hit save. This is the state that I recommend you work in, where every time you hit save, everything is formatted automatically, you don’t need to worry about it, and all the tests should pass.
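VSCode stores these options in a JSON settings file, so the tick box above corresponds to the `editor.formatOnSave` key. As a hedged sketch, the Python snippet below builds such a settings dictionary and prints it as JSON. Scoping Prettier as the default formatter for Markdown, YAML and JSON goes beyond what the talk shows and is an assumption of this example, as is the helper name `with_format_on_save`; `esbenp.prettier-vscode` is the identifier of the widely used Prettier extension.

```python
import json

PRETTIER_EXTENSION_ID = "esbenp.prettier-vscode"


def with_format_on_save(settings: dict) -> dict:
    """Return a copy of the settings with format-on-save switched on."""
    updated = dict(settings)
    updated["editor.formatOnSave"] = True
    # Assumption: use Prettier only for the file types it handles here.
    for language in ("markdown", "yaml", "json"):
        key = f"[{language}]"
        scoped = dict(updated.get(key, {}))
        scoped["editor.defaultFormatter"] = PRETTIER_EXTENSION_ID
        updated[key] = scoped
    return updated


print(json.dumps(with_format_on_save({"editor.tabSize": 2}), indent=2))
```

The printed JSON could be merged by hand into a user or workspace `settings.json`; the script deliberately does not write to any file.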

17:30 OK, I’m running a bit late, so final little bit. Sometimes Prettier breaks or doesn’t do what you want it to do, or there’s some specific reason that you want to ignore stuff. I said that there’s this .prettierignore file. One of the things that we ignore, in the new release that’s going to come out for the template any minute, is this email template file, because Prettier broke it. We’ve got Groovy code mixed into HTML and that confuses it. There are sometimes legitimate reasons to ignore the code linters. If you want to ignore an entire file, you can stick the file name into the .prettierignore file. If you look at the Prettier documentation, you will see that there’s a section about ignoring code. It talks about that file that I just mentioned, and it also says that you can use keywords within the file. Basically, you make a comment that says prettier-ignore, and that will ignore a chunk of that file. There’s a way to do this. This just ignores that one line and then Prettier will continue to pass. But of course, this is for exceptional use cases only. Most of the time you don’t need to do this and you should just let Prettier do its thing.
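For Prettier, the ignore comment takes the comment syntax of the file type, for example `<!-- prettier-ignore -->` in Markdown or `# prettier-ignore` in YAML. Black, the Python formatter discussed earlier, offers an analogous mechanism that is not shown in the talk: `# fmt: off` and `# fmt: on` around a region, or `# fmt: skip` at the end of a single line. The sketch below illustrates it; the variable names and values are hypothetical.

```python
# Black leaves everything between "fmt: off" and "fmt: on" untouched,
# so the hand-aligned columns below survive formatting.
# fmt: off
WEIGHTS = [
    [1.0,  0.0,  0.0],
    [0.0,  10.0, 0.0],
    [0.0,  0.0,  1.0],
]
# fmt: on

# A trailing "fmt: skip" comment protects just this one line.
scale = {"x":1, "y":2}  # fmt: skip

print(WEIGHTS[1][1], scale["y"])
```

As with the Prettier comments, this is best kept for rare cases where the automatic layout genuinely hurts readability.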

18:49 Final couple of slides just to wrap up then. Yes, the final mention. I mentioned YAML and Markdown, those are two of the biggest offenders here, and also JSON and a few others. But of course, wouldn’t it be amazing if we could do this with Nextflow code, and have a standard for indentation after input and output blocks and script blocks and everything? Those of us who have worked with it for a while, and especially those who are used to code linters and like them, would absolutely love this. But it doesn’t exist yet. However, Prettier can handle plugins. That’s how the website does the PHP code. It’s a sort of semi-unofficial plugin. There is some interest within the nf-core and Nextflow communities in building something like this. The other day we were talking about this and Edmund actually kicked off a new Slack channel called Prettier plugin Nextflow. There’s nothing really to see there yet, but if you’re interested, and especially if you have any experience in doing it or would like to help out or just get involved, go and check out that Slack channel and join in. Because please, please, I want it. I imagine you’ll make a lot of friends if you can make this work. nf-core is all about standardization and best practices. This would really be the icing on the cake for that. Right.

20:33 With that, I’m happy to take any questions, be it about linting or anything else. You can hit me. There’s no one moderating today, so I would just keep an eye on the Slack chat. There’s a couple of things in there. Or just unmute yourself, I believe.

(question) There’s a question about what do I use for Nextflow at the moment?

(answer) Yeah, EditorConfig. That’s one of the main reasons that EditorConfig is in there. It’s a general-purpose tool and it’s very, very simple. Like I said, it just checks the indentation: two spaces, four spaces. That’s about it. It’s still possible to have pretty variable formatting, but at least it’s something. Like I said, hopefully we’ll have a Prettier plugin one day.

21:21 (question) Anders asks, have I ever come across that Black introduces bugs in my Python code?

(answer) No. I quite like that, actually: when I hit save, if I don’t see things move around, that usually means that my Python code is invalid and there’s a syntax error somewhere, because Black can’t run on it. It’s actually the other way around. I think Black probably saves me from bugs by alerting me early to the fact that there might be something fishy going on. I’ve never seen it introduce a bug. Sometimes you might not agree with the choices it makes. Black especially defaults to a very narrow column width, so it breaks loads of stuff over lots and lots of lines quite quickly unless you change that default. But I’ve never seen it actually introduce errors or bugs.

Maura says that he likes the shorter 80 or 88 character line length. This is personal preference. I think that we’ve got it set to 120 at the moment, partly because that’s what I had everything else set up to when I came into this.

With the configs, you’ll see there are a couple of weirdish bits. I don’t know who was paying attention and might have spotted it. I’ll go back to it again in this window. If we go to the pipeline template and go to the EditorConfig file, you’ll see here that we’ve got some stuff set up to be indent size four. That’s all files, so Nextflow and config files and everything are indent size four. Then for some stuff we’ve got indent size two. Then there’s some stuff, like modules, where it’s unsetting a bunch of settings. It’s not clobbered, but we’re doing some stuff here. With some of these choices you might ask: why are we using two for some files and four for others? That doesn’t seem very consistent. It’s the same for line length. There’s a pretty simple answer to that. We tried to minimize the number of changes that would be introduced into the code base when we started using these tools. JSON was already set as four, I think, so we tried to stick with four so that it didn’t break everything. If we already had some standard, we tried to stick with that.

Yeah, line length can be set in EditorConfig. I think that might be where we’re setting it. I’m not sure. Okay. That’s nice, actually. If we can set the Prettier line length in EditorConfig, then maybe we can just completely get rid of the Prettier config file. That’d be nice.
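The mix of indent sizes works because EditorConfig applies every section whose pattern matches a file, with later sections overriding earlier ones. The Python sketch below models that idea in a simplified way. The rules are illustrative and are not copied from the nf-core template, and `fnmatch` only approximates EditorConfig's own glob syntax (it has no brace expansion, for instance).

```python
from fnmatch import fnmatch

# Illustrative rules only: a catch-all default followed by more specific overrides.
RULES = [
    ("*", {"indent_size": 4}),
    ("*.md", {"indent_size": 2}),
    ("*.yml", {"indent_size": 2}),
    ("*.yaml", {"indent_size": 2}),
]


def effective_settings(filename: str, rules=RULES) -> dict:
    """Merge the properties of every matching rule; later rules win."""
    settings = {}
    for pattern, properties in rules:
        if fnmatch(filename, pattern):
            settings.update(properties)
    return settings


for name in ("main.nf", "nextflow.config", "README.md", "modules.yml"):
    print(name, effective_settings(name))
```

With these example rules, Nextflow and config files fall back to the catch-all indent of four, while Markdown and YAML files pick up the override of two.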

24:12 Any more questions? Great. Well, thank you for sticking with me on what could be a bit of a dry topic, but hopefully a useful set of tools for everybody. As always, if you have any questions or problems, please jump into the nf-core Slack and we’ll be more than happy to help you out.