Add full-text search #24
|
Hm, not sure. One of the biggest advantages of this software is its portability and zero configuration. A server with a search endpoint definitely breaks both. |
|
Also, |
|
|
def search_archive(archive_path, pattern, regex=False):
    args = '-gi' if regex else '-Qig'
    ag = run(['ag', args, pattern, archive_path], stdout=PIPE, stderr=PIPE, timeout=60)
TODO: confirm this isn't a potential source of shell-injection vulns
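One way to check the TODO above: when `subprocess.run` is given a list and `shell=True` is not set, no shell parses the arguments, so shell metacharacters in `pattern` should arrive as literal text. The sketch below is only an illustration under that assumption. `build_search_command` and `echo_argv` are hypothetical helpers (not part of the PR), and a child Python process stands in for `ag` so the check runs without `ag` installed. It does not cover other risks, such as an `archive_path` that begins with `-` being read as an option.

```python
import subprocess
import sys


def build_search_command(archive_path, pattern, regex=False):
    """Hypothetical helper: build the same argv list as the snippet under review."""
    flags = '-gi' if regex else '-Qig'
    return ['ag', flags, pattern, archive_path]


def echo_argv(argv):
    """Hypothetical helper: run a child Python process (no shell) that prints
    back each argument it received, one per line."""
    child = [sys.executable, '-c', 'import sys; print("\\n".join(sys.argv[1:]))']
    result = subprocess.run(child + argv, stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE, timeout=60)
    return result.stdout.decode().splitlines()


# A pattern full of shell metacharacters.
hostile = 'foo; rm -rf ~ $(whoami)'
cmd = build_search_command('./archive', hostile)

# Pass everything after the program name to the stand-in child process.
received = echo_argv(cmd[1:])
print(received)
```

If the pattern comes back as a single unmodified element, the shell never saw it, which is the property the TODO asks about.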
|
You may be right, I'll take a closer look today. Not sure what you mean by "hence no full-text search", do you mean no full-text fuzzy-search? |
|
I mean, no stemming or synonyms, so "programs" won't match "programming". Not sure if this is 100% necessary though. |
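To make the stemming point concrete, here is a deliberately crude sketch. `crude_stem` and `stem_match` are hypothetical names, and the suffix rules are a toy, not the Porter stemmer or anything `ag` or elasticlunr actually does. It only shows why a plain substring search misses "programming" for the query "programs" while a stemmed comparison finds it.

```python
def crude_stem(word):
    """Toy stemmer: strip one common suffix, then collapse a doubled final consonant."""
    word = word.lower()
    for suffix in ('ing', 'es', 's'):
        # Only strip if at least three characters of stem remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            word = word[:-len(suffix)]
            break
    # "programm" -> "program"
    if len(word) >= 2 and word[-1] == word[-2] and word[-1] not in 'aeiou':
        word = word[:-1]
    return word


def stem_match(query, text):
    """Return True if the stemmed query equals the stem of any word in the text."""
    stems = {crude_stem(w) for w in text.split()}
    return crude_stem(query) in stems


text = 'I like programming'
plain_hit = 'programs' in text.lower()      # literal substring search
stemmed_hit = stem_match('programs', text)  # comparison after stemming
print(plain_hit, stemmed_hit)
```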
|
I've been leaning towards adding a backend for a little while now, because that way we can run a GUI with a submit links page that allows you to upload bookmark exports or link third party sites for constant syncing. I feel a backend is the natural next step as this grows in complexity, but I do agree that it would be nice to keep things static and backendless until absolutely necessary. |
|
Elasticlunr looks pretty good! I just have to figure out how to serialize a pre-built index and load it into clients' browsers (which they don't seem to mention how to do in the docs). |
|
I think it should be http://elasticlunr.com/docs/index.js.html#toJSON Example: http://elasticlunr.com/example/example_index.json (used in their example, see http://elasticlunr.com/example/app.js) |
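The pre-build-then-load idea can be sketched independently of elasticlunr. The example below is an assumption-laden illustration in Python, not elasticlunr's index format or API: `build_index` is a hypothetical helper that builds a tiny inverted index at archive time, dumps it to JSON (the role `toJSON` plays in the links above), and then reloads it the way a client would before searching.

```python
import json
from collections import defaultdict


def build_index(docs):
    """Hypothetical helper: map each lowercase token to the ids of documents containing it."""
    index = defaultdict(list)
    for doc_id, text in docs.items():
        for token in sorted(set(text.lower().split())):
            index[token].append(doc_id)
    return dict(index)


# Made-up sample documents, for illustration only.
docs = {'1': 'Full text search', '2': 'Static archive search'}

# Build once at archive time and write out as static JSON.
serialized = json.dumps(build_index(docs), sort_keys=True)

# Later, a client loads the static file and queries it with no backend.
loaded = json.loads(serialized)
hits = loaded.get('search', [])
print(hits)
```

The design point is that the serialized index is just another static file, so the archive stays backend-free.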
|
Ahh it's in the API reference, thanks, I should've looked there first. I'll mock up an integration with it sometime this week or next. |
|
Hmm, one of the nice parts of having a backend is that it'll make the archiving interface a lot easier to use for non-technical people, which is currently a blocker to this becoming more widely used. It would be trivial to add a page where people can upload/manage their export files, and eventually configure the backend with live syncing, etc. I'm kind of OK with it being a command-line-only thing for now, since I don't want to sap away at Mozilla's Pocket revenue stream, but at the same time it would be really nice to provide a one-command setup script that lets you add new export files via http://localhost:8086 or something. I'll keep thinking about this. Been super busy these last few weeks, working on launching a beta for my company, so I probably won't have much time to code up elasticlunr just yet. |
|
Or maybe keep it the Unix way and provide the web UI as a separate application? It could call the script via the OS or use it as a library. |
|
Definitely, that was my plan with the Flask backend. Any web UI stuff would be run via |
|
Depending on how complicated you're willing to get, another option that would allow the addition of any number of backend services would be to package up the application as a Docker image and only support that. Discourse.org does this, and it allows anyone technically proficient enough to set up Docker to get a server up and running locally, without having to install the myriad dependencies. (See https://github.com/discourse/discourse/blob/master/docs/INSTALL.md#why-do-you-only-officially-support-docker) |
|
Yup @ivar good intuition, that's already been on my roadmap for a while: #65 (comment) I'm just swamped these days with my day job, so improving BA is slow going. I'm actually almost done with the docker/docker-compose setup which includes BA itself and nginx to serve it, I will definitely add a container for the backend once that's complete too. |
|
Closing this for now until the Django backend is released, then I'll open a new ticket for adding full-text search in the new Django app. |
I added full-text search of the wget archives using ag (the silver searcher). A simple Flask app provides the search endpoint to the frontend. Instructions were added to the readme for how to run the ag search backend, or how to use ./search.py from the CLI.