Pagefault We write about things, mostly programming related

Fixing bugs - Part 1: Emacs eshell

I hope this will be a continuing series of posts where I’ll go through bugs I fix in various FOSS projects. Mainly Emacs and Firefox. This is the first post where I’ll show you my first patch to the great editor EMACS.

The Bug

Emacs has a great little program called eshell which is a shell like command line interface. It implements some sh like features but also of course integrates great with emacs as it is written in eshell. For example you can do ls and list stuff in directories, you can also kill processes, and this is where our bug surfaces.

In bash you can do stuff like kill -9 <pid> or kill -SIGKILL <pid> to send the appropriate kill signals to the process that you are killing. We would expect eshell/kill to also handle this, especially since the documentation says it will :)

So we fire up eshell (M-x eshell RET) and try it!

$ kill -9 123
kill: bad pid: -9

Not really what we expect and if we give it a named signal such as SIGKILL it just fail silently.

Let’s look at what’s going on. eshell/kill is defined in esh-proc.el. The first step is to just see what’s going on when we execute the function. To do that we can just turn edebug on the function (this is done by going to the end of the function definition and press C-u C-M-x)!

(defun eshell/kill (&rest args)
  "Kill processes.
Usage: kill [-<signal>] <pid>|<process> ...
Accepts PIDs and process objects."
  ;; If the first argument starts with a dash, treat it as the signal
  ;; specifier.
  (let ((signum 'SIGINT))
	(let ((arg (car args))
		  (case-fold-search nil))
	  (when (stringp arg)
		 ((string-match "\\`-[[:digit:]]+\\'" arg)
		  (setq signum (abs (string-to-number arg))))
		 ((string-match "\\`-\\([[:upper:]]+\\|[[:lower:]]+\\)\\'" arg)
		  (setq signum (abs (string-to-number arg)))))

Above is the first part of the function and when we step through when it’s called like kill -9 123 we see that the (when (stringp arg) part evaluates to nil. This is because the -9 is not a string since it will be converted to an integer.

By chance I had read some other eshell code and had noticed this line:

(put 'eshell/rm 'eshell-no-numeric-conversions t)

So if we start by doing this for eshell/kill maybe we solve some part of the bug? ~~~ emacs-lisp (put ‘eshell/kill ‘eshell-no-numeric-conversions t) ~~~

We do the same process and now we see why the arguments where converted to integers! We get the error kill: invalist argument type: string, looking at the code it seems like we expect arg to be an number since the signal-process function expects the pid as a number.

	(while args
	  (let ((arg (if (eshell-processp (car args))
					 (process-id (car args))
				   (car args))))
		(when arg
		   ((null arg)
			(error "kill: null pid.  Process may actually be a network connection."))
		   ((not (numberp arg))
			(error "kill: invalid argument type: %s" (type-of arg)))
		   ((and (numberp arg)
				 (<= arg 0))
			(error "kill: bad pid: %d" arg))
			(signal-process arg signum)))))

This should be easy to fix, we just convert it ourselves in the function.

-  (car args))))
+  (string-to-number (car args)))))

Great! Now we are able to actually parse numeric arguments! How about named signals like SIGKILL? Let’s try it!

Again we just fail silently in signal-process again. Placing my cursor and pressing M-. takes me to the definition of this function. This is implemented in C and if we read the doc string we notice that SIGCODE can either be an integer or a symbol, not a string which is what we are sending. So we need to make a symbol of the signal we want to send if it is a named signal. How do we do that?

My first patch used the function make-symbol but after getting some great code review from Noam and Eli the mentioned that the appropriate function is intern, they do the same thing but make-symbol is generally used for macros as Noam explained to me:

Usually, make-symbol is only for macros, where you want a symbol that is not `eq’ to any other. Sometimes it’s handy for making a unique object at run-time. In this case, since we only care about the symbol name, it doesn’t really matter, it’s just a bit surprising to see it because it’s usually a mistake in a non-macro context.

So we add that to our function and the code part of the patch is complete! ~~~ diff - (setq signum (abs (string-to-number arg))))) + (setq signum (intern (substring arg 1))))) ~~~

Now all that was left is to update the documentation and NEWS file. If you want to see the full patch it is at:

Thanks for reading and as always when you fix a bug you get to watch the best part of Erlang the Movie

Fixing a bug!