Skip to content

Simplified plugin impl, removed unreliable onResume#152

Merged
jglick merged 4 commits intojenkinsci:masterfrom
jglick:overhaul
Jul 30, 2024
Merged

Simplified plugin impl, removed unreliable onResume#152
jglick merged 4 commits intojenkinsci:masterfrom
jglick:overhaul

Conversation

@jglick
Copy link
Member

@jglick jglick commented Jul 29, 2024

Cleaning up after #48.

@jglick jglick added the bug label Jul 29, 2024
<changelist>999999-SNAPSHOT</changelist>
<gitHubRepo>jenkinsci/${project.artifactId}-plugin</gitHubRepo>
<jenkins.version>2.387.3</jenkins.version>
<jenkins.baseline>2.440</jenkins.baseline>
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just updating to a recommended baseline while we are here.

Comment on lines -10 to -11
* might change in between usages and where it is not possible to propagate
* the object in other ways.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can just pass the Launcher to each call.

import java.io.IOException;

/**
* Need an interface in order to export the object from the channel.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if this was actually true prior to #48, but it is certainly not true now—this interface was not being passed to Channel.export.

import hudson.model.TaskListener;

/**
* Extension point for ssh-agent providers.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When there is just a single impl for years, this was unnecessary complexity.

* @throws Throwable if things go wrong.
* @deprecated use {@link #SSHAgentEnvironment(hudson.Launcher, hudson.model.BuildListener, java.util.List)}
*/
@Deprecated
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No reason to keep these long-deprecated overloads around; who would call them?

Comment on lines -361 to +309
Map<String, Throwable> faults = new LinkedHashMap<>();
for (RemoteAgentFactory factory : Jenkins.get().getExtensionList(RemoteAgentFactory.class)) {
if (factory.isSupported(launcher, listener)) {
try {
listener.getLogger().println("[ssh-agent] " + factory.getDisplayName());
agent = factory.start(new SingletonLauncherProvider(launcher), listener,
workspace != null ? SSHAgentStepExecution.tempDir(workspace) : null);
break;
} catch (Throwable t) {
faults.put(factory.getDisplayName(), t);
}
}
}
if (agent == null) {
listener.getLogger().println("[ssh-agent] FATAL: Could not find a suitable ssh-agent provider");
listener.getLogger().println("[ssh-agent] Diagnostic report");
for (Map.Entry<String, Throwable> fault : faults.entrySet()) {
listener.getLogger().println("[ssh-agent] * " + fault.getKey());
StringWriter sw = new StringWriter();
fault.getValue().printStackTrace(new PrintWriter(sw));
for (String line : StringUtils.split(sw.toString(), "\n")) {
listener.getLogger().println("[ssh-agent] " + line);
}
}
throw new RuntimeException("[ssh-agent] Could not find a suitable ssh-agent provider.");
}
this.agent = agent;
agent = new ExecRemoteAgent(launcher, listener);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When there is no list of factories to consult, the code is simpler and the error handling should be as well: either running ssh-agent works, or it does not.

@Override
public void buildEnvVars(Map<String, String> env) {
env.put("SSH_AUTH_SOCK", agent.getSocket());
env.putAll(agent.getEnv());
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably only this one variable really needs to be passed to nested steps, but ssh-agent(1) suggests you just eval the env, so why not?

Comment on lines -26 to +32
public class SSHAgentStep extends AbstractStepImpl implements Serializable {

private static final long serialVersionUID = 1L;
public class SSHAgentStep extends Step {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Getting rid of long-deprecated Guice usage.

return new StandardUsernameListBoxModel()
.includeMatchingAs(
item instanceof Queue.Task ? Tasks.getAuthenticationOf((Queue.Task)item) : ACL.SYSTEM,
item instanceof Queue.Task ? Tasks.getAuthenticationOf2((Queue.Task)item) : ACL.SYSTEM2,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spring Security

Comment on lines +46 to +49
} catch (Exception x) {
cause.addSuppressed(x);
}
super.stop(cause);
Copy link
Member Author

@jglick jglick Jul 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably better, so we pass along interruptions and so on as is. Otherwise I think I have seen builds whose terminating error was supposedly running ssh-agent -k, when this was just a side effect.

Comment on lines -81 to -86
@Override
public void onResume() {
super.onResume();
try {
purgeSockets();
initRemoteAgent();
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This never made sense, and probably contributed to the instability of the plugin. Why would we stop and restart the SSH agent process, running on some agent computer, just because the controller restarted?


/**
* Purges all socket files created previously.
* Especially useful when Jenkins is restarted during the execution of this step.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something else dating to #4 but of unclear purpose. We already call ssh-agent -k, which deletes the socket file.

Comment on lines -216 to -217
File socket = new File(it.next());
if (socket.exists()) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Furthermore, this was looking up the socket path on the controller filesystem, where it would never exist. So this was dead code AFAICT.

* Runs a native SSH agent installed on a system.
*/
public class ExecRemoteAgent implements RemoteAgent {
public final class ExecRemoteAgent implements Serializable {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK to serialize now that this merely contains a Map<String, String>, and means that the StepExecution does not need to recreate it.

@Override
public boolean isSupported(Launcher launcher, final TaskListener listener) {
try {
int status = launcher.launch().cmds("ssh-agent", "-k").quiet(true).start().joinWithTimeout(1, TimeUnit.MINUTES, listener);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pointless overhead; we are just going to run it for real shortly afterwards.

@jglick jglick changed the title Simplified plugin impl Simplified plugin impl, removed unreliable onResume Jul 30, 2024
@jglick jglick marked this pull request as ready for review July 30, 2024 01:23
@jglick jglick requested a review from a team as a code owner July 30, 2024 01:23
@jglick jglick merged commit 8933585 into jenkinsci:master Jul 30, 2024
@jglick jglick deleted the overhaul branch July 30, 2024 12:13

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file being replaced by something. I've got failures all over the place with the removal of this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you be more specific please? Do you have a Pipeline build which was in the middle of sshagent at the time of an upgrade? I do not think there is a functional test for that yet…

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After the server is restarted, with the latest release from a few hours ago of ssh-agent, I have other plugins failing on start because they are looking for LauncherProvider class. Notice in your latest merges/release this class has been removed.

So trying to determine if this LauncherProvider has been replaced with something so I can at the very least alter my own plugin to use the new setup.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

other plugins failing on start because they are looking for LauncherProvider

Which plugins? There are no outside references to this class that I am aware of.

if this LauncherProvider has been replaced with something

No.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So yes there are some outside references. I think I see how to modify my own code by looking at the changes. If I figure out the other plugins I'll let you know, chances are if your unaware it's going to be something unregistered.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not think there is a functional test for [a Pipeline build which was in the middle of sshagent at the time of an upgrade] yet

Tried to write one, but it looked hard since the plugin depends on a specific process running. Tested interactively and the step survived upgrade.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file being replaced by something. I've got failures all over the place with the removal of this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you be more specific please? Do you have a Pipeline build which was in the middle of sshagent at the time of an upgrade? I do not think there is a functional test for that yet…

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After the server is restarted, with the latest release from a few hours ago of ssh-agent, I have other plugins failing on start because they are looking for LauncherProvider class. Notice in your latest merges/release this class has been removed.

So trying to determine if this LauncherProvider has been replaced with something so I can at the very least alter my own plugin to use the new setup.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

other plugins failing on start because they are looking for LauncherProvider

Which plugins? There are no outside references to this class that I am aware of.

if this LauncherProvider has been replaced with something

No.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So yes there are some outside references. I think I see how to modify my own code by looking at the changes. If I figure out the other plugins I'll let you know, chances are if your unaware it's going to be something unregistered.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not think there is a functional test for [a Pipeline build which was in the middle of sshagent at the time of an upgrade] yet

Tried to write one, but it looked hard since the plugin depends on a specific process running. Tested interactively and the step survived upgrade.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants