The previous commits have put rules for linting and formatting via ruff, yapf, mypy and pyright into place. They are checked with the make check target, and this commit adds the fixes for the target to succeed.
It does some refactoring where type checking dug up dirty bits, and also adds lots of churn in the Python code. To a good deal, that's owed to mere formatting changes. It would have been better to seperate those from syntax and refactoring fixes into multiple commits, so that the interesting changes don't drown in the formatting nose. However, that would have been a lot of additional work only to be thrown away by later commits, hence this commit has a big diff in one piece. The size of the diff is regrettable but hopefully a one-off: What it buys is automatic format checking for CI and predictble formats for smaller diffs in the future.
Rules that "make check" enforces are, in the following order
- Syntax checkers:
- ruff check .
- mypy .
- pyright
- Format check:
- yapf --diff --recursive .
The refactoring includes:
- Turn the Result class into a more elaborate object, capable of
doing more heavy lifting around stderr and stdout decoding,
summarizing outcome, and matching error strings.
Aside from fixing broken type checks, this also removes lots of
boilerplate calling code which is currently used for handling
possible call outcome scenarios. Trying to access an inexistent,
decoded string should raise a meaningful exception by itself now,
which removes lots of code with case distinctions.
- Fix Cmd type hierarchy:
- Add the AbstractCmd class above Cmd. This is necessary because
the checker rightfully complains it can't instantiate a Cmd
instance where constructor arguments were needed. They never
were, but the type used at the instantiating code's location in
jw.pkg.App so claims.
- Lots of sub- and sub-subcommands are derived from the base
class of the invoking command. That provides some properties
shared across the ancestor hierarchy of a command, but is
semantically unsound. Fix that by introducing jw.pkg.BaseCmd
class as a place to provide basic helpers shared across all
commands used in a jw.pkg.App's context, and derive all command
classes from that afresh. The parent command is still reachable
via a common parent property.
Formatting changes are conforming to PEP-8, mostly, with minor tweaks. All in all they include the following changes.
- Remove # -*- coding: utf-8 -*-
The line was needed by Python 2 which is not supported anylonger.
For Python 3, the default encoding is UTF-8, anyway.
- Allow to run "make py-format" without having it produce any
changes. It's basically "yapf --in-place --recursive ." with some
code style settings, see conf/topdir/pyproject.toml. The settings
may be debatable. I've had custom tweaks in place on that target,
too, but then again, IDEs would have more hassle to integrate
that.
- Introduce a 88 character line length limit
- One import per line, reshuffle them semantically, see
[tool.isort] in pyproject.toml.
- Hide imports needed for type-checking only behind
if TYPE_CHECKING
- Spaces around assignments accounts for much churn. Having having
no spaces in inline parameter list assignments and default
parameter values would arguably be more compact where it's
useful. On the other hand, I have not found a code formatter
which allows spaces around assignments in parameter lists broken
into one per line and that's often better than a wall of text.
- Add two spaces before # export, as this seems to be mandated by
PEP-8
Add an async open() method which should allow to do what __init__() couldn't, because it's not async, and to match the already existing .close(). It's called by __aenter__() __aexit__() if the FileContext is instantiated as context manager, or at will when the user finds it a good idea.
Commit a19679fec reverted the first attempt to make AsyncSSH reuse one connection during an instance lifetime. That failed because a lot of distribution-specific properties were filled in a new event loop thread started by AsyncRunner, and AsyncSSH didn't like that.
The last commit provided the needed properties as members of the Distro class. This commit is the second part of the solution: Keep one connection around as a class member and reuse it on every _run() invocation.
The name of the env parameter to ExecContext.run() and .sudo() is not descriptive enough for which environment is supposed to be modified and how, so rename and split it up as follows:
- .run(): env -> mod_env
- .sudo(): env -> mod_env_sudo and mod_env_cmd
The parameters have the following meaning:
- "mod_env*" means that the environment is modified, not replaced
- "mod_env" and "mod_env_cmd" modify the environment "cmd" runs in
- "mod_env_sudo" modifies the environment sudo runs in
Fix the fallout of the API change all over jw-pkg.
All methods are async and call their protected counterpart, which is designed to be overridden. If possible, default implementations do something meaningful, if not, they just raise plain NotImplementedError.
Add the parameter "atomic" to put() / _put(). If instructs the implementation to take extra precautions to make sure the operation either succeeds or fails entirely, i.e. doesn't leave a broken target file behind.
cmd_input is passed as None to _run(), which is legal, but then used in a call to cmd_run(), which is a public API and, hence, illegal. InputMode.NonInteractive should be used instead, do that.
run_cmd() is a thin layer over the public ExecContext API, which falls back to using a Local instance if not other ExecContext is specified explicitly. Both the default Local context as the subsequent call to run() should have the same idea about interactivity, so allowing to specify it in two parameters ("interactive" and "cmd_input") is a bad idea. Remove "interactive".
Allow to configure via the environment which class ssh_client() picks. Can currently be exec, asyncssh, paramiko or a comma-separated search list. The list will be tried through until a class is found that can be instantiated.
Reusing AsyncSSH's connection is fine and fast, but only if it's not combined with the AsyncRunner. See commit 67e51cf0 why it was introduced in the first place, along with a reasoning why it may be a bad idea. Looks like we're now reaping what we sowed.
The current plan to get this to fly is to sprinkle async / await all over the code paths to App.os_release(). That is a lot of churn, so postpone and revert for now to keep CI working.
File "~/local/src/jw.dev/proj/jw-pkg/scripts/jw/pkg/lib/ec/ssh/AsyncSSH.py", line 463, in _run_ssh
return await self._run_on_conn(
^^^^^^^^^^^^^^^^^^^^^^^^
...<7 lines>...
)
^
File "~/local/src/jw.dev/proj/jw-pkg/scripts/jw/pkg/lib/ec/ssh/AsyncSSH.py", line 403, in _run_on_conn
proc = await conn.create_process(
^^^^^^^^^^^^^^^^^^^^^^^^^^
...<7 lines>...
)
^
File "/usr/lib/python3.13/site-packages/asyncssh/connection.py", line 4492, in create_process
chan, process = await self.create_session(
^^^^^^^^^^^^^^^^^^^^^^^^^^
SSHClientProcess, *args, **kwargs) # type: ignore
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.13/site-packages/asyncssh/connection.py", line 4385, in create_session
session = await chan.create(session_factory, command, subsystem,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
...<4 lines>...
bool(self._agent_forward_path))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.13/site-packages/asyncssh/channel.py", line 1149, in create
packet = await self._open(b'session')
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.13/site-packages/asyncssh/channel.py", line 717, in _open
return await self._open_waiter
^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: Task <Task pending name='Task-1' coro=<App.__run() running at ~/local/src/jw.dev/proj/jw-pkg/scripts/jw/pkg/lib/App.py:137> cb=[_run_until_complete_cb() at /usr/lib64/python3.13/asyncio/base_events.py:181]> got Future <Future pending> attached to a different loop
Add class Curl as the first pure FileTransfer class without _run() / _sudo(). It doesn't use any PycURL / libcurl-like binding, but runs the curl binary in a subprocess. That looks the most portable still.
For now, move the definiions of Result, Input and InputMode from ExecContext into lib.base. Having to import them from the ExecContect module is too heavy-handed for those simple types.
The Input instance passed as cmd_input to ExecContext.run() and .sudo() currently may be of type str. Allow to pass bytes, too.
At the same time, disallow None to be passed as cmd_input. Force the caller to be more explicit how it wants input to be handled, notably with respect to interactivity.
Along the way fix a bug: Content in cmd_input should result in CallContext.interactive == False but doesn't. Fix that.
_run_ssh() of ssh.Exec doesn't pass throw=False to run_cmd(), which makes it throw exceptions, and effectively strips the caller of any chance to get hold of stdout and stderr. Pass throw=False and let run() decide according the the caller-provided throw parameter whether or not a problem should propagate up as exception or return value.
ssh_client() tries a predefined order of client class implementations until it finds a workable candidate. For testing all, it's desirable to be able to target the exact class. Add a "type" parameter to achieve that.
I'm aware that type is also a function. But the semantics look so compelling to me that I'm using the variable name anyway.
Naively join()ing a command list to be executed remotely via SSH also quotes shell operators which doesn't work, of course. Work around that. The workaround will not always work but covers lots of cases.
Instantiating a SSHClient-derived class with an invalid or missing uri parameter is accepted and fails later down the road. Raise an Exception early on to make the error log more comprehensible.
The SSHClient classes Paramiko and Exec are exported via # export. This is a bad idea, because if Paramiko is not installed, none of the other's can be instantiated either: On the attempt to load them, __init__.py is loaded first and fails. SSHClient.ssh_client() knows what to do, no need to auto-import them into the lib.ec.ssh module.
Request a remote PTY from AsyncSSH, and wire the local terminal's stdin up with it if interactive == True. This gives a real interactive session if local stdin belongs to a terminal. Also, thanks to AsyncSSH understanding that, forward terminal size changes to the remote end.
Add a SSHClient implementation using AsyncSSH. This is the first and currently only class derived from SSHClient which implements SSHClient.Cap.LogOutput, designed to consume and log command output as it streams in. It felt like the lower hanging fruit not to do that with Paramiko: Paramiko doesn't provide a native async API, so it would need to spawn additional worker threads. I think.
Add an optional caps ("capabilities") argument to the constructor of SSHClient. It is meant to be used by derived classes in order to declare that they don't want the base class to handle a default behaviour for a certain capability, but that they want to implement it themselves instead.
Also, give the _run_ssh() callbacks the necessary info as parameters, so that the derived classes have the means to do so.
Since commit 02697af5, ExecContext.run() returns bytes for stdout and stderr and fixes that in calling code. The thing it did not fix was the code calling run_cmd(), which also made return bytes. This commit catches up on that.
Move the code of SSHClientInternal and SSCClientCmd into lib.ec.ssh, as "Paramiko" and "Exec", respectively. This makes the class layout a little more modular, and along the way fixes a bug where SSHClientInternal could be instantiated but was unusable (if the Paramiko is not installed).
ExecContext's .sudo() omits many of run()'s parameters, and this commit adds them. To avoid redundancy around repeating and massaging the long parameter list of both functions and their return values, it also adds some deeper changes:
- Make run(), _run(), sudo() and _sudo() always return instances of
Result. Before it was allowed to return a triplet of stdout,
stderr, and exit status.
- Have ExecContext stay out of the business of decoding the result
entirely. Result provides a convenience method .decode()
operating on stdout and stderr and leaves the decision to the
caller.
This entails miniscule adaptations in calling code, namely in
App.os_release, util.get_profile_env() and CmdListRepos._run().
- Wrap the _run() and _sudo() callbacks in a context manager object
of type CallContext to avoid code duplication.
- Consistently name the first argument to run(), _run(), sudo() and
_sudo() "cmd", not "args". The latter suggests that the caller is
omitting the executable, which is not the case.
Take a positional uri argument to the constructor of ExecContext, forcing SSHClient to follow suit. The latter was instantiated with a hostname as only argument up to now, which still works as a special case of an uri.