Integrate a simple built-in Search Engine

  • #3043
    [anonymous]

    Greetings. I know I may be asking for too much but here goes.

    1. As a static HTML site publisher, Publii competes directly with software such as HelpNDoc, which also produces fully Google-indexable static HTML sites with advanced built-in search, though I fully agree those sites may not be as attractive or SEO-friendly as the ones Publii makes.

    2. Google Search is completely blocked in China and, I believe, fully or partially blocked in 25+ other countries.

    3. Plus, Google site search loads Google scripts on our Publii static HTML website and has to communicate with Google servers just for the site search, which may also slow the site down in countries where Google Search is blocked.

    So I would like to request that you integrate a built-in HTML indexing script at some point, in the near or far future, which would index all our HTML posts when we export and upload via SFTP, and show its own search box. Then we won’t have to depend on Google site search at all.

    Just like Google search results, it should show the search query highlighted within the context in which it appears in the post, which most search engine scripts do.

    I am willing to donate if you need to buy the license for such a script in case it is not available in open-source libraries.

    #3064
    [anonymous]

    This would be a nice feature. I’d be willing to donate for this as well. I’m not very fond of Google either.

    #3065
    [anonymous]

    Thanks so much, Mathias. So if Tomasz or Bob let us know, we will both be glad to chip in for this. It does not have to be done right away. Whenever convenient.

    #3068
    Bob

    We plan to create our own search engine in Publii, most likely based on FlexiSearch, but we want to distribute it as a plugin, so first we need to add support for plugins, which will probably take some time. I can’t promise, but it will probably appear in the fourth quarter of the year.

    #3069
    [anonymous]

    Wow, just wow dear Bob. You just made my day. Thanks for sharing. Still if you need any financial help for it, please let us know.

    Please also kindly consider a plugin for my other request for exporting the blog as a fully offline epub ebook with the post titles as the epub TOC. I am sure many will buy that plugin also. Thanks again.

    #3072
    [anonymous]

    That’s great news. I wonder how a search function, which is by definition something dynamic, could work on a static website? Some JavaScript that indexes keywords and links into one static page? That index could become quite huge with one of my websites that has more than 2000 posts 🙂

    #3074
    Bob

    This is exactly how it works: we need to generate an additional file with the index of all your content. To avoid generating a huge file, we plan to add several options, such as:

    1. index the post titles only
    2. index post titles + headings (H1-H6)
    3. index post titles + post entry
    4. maybe others, if you have any good ideas…
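
A minimal sketch of how such an index file could be generated at export time, with the scope options above as a switch. This is not Publii's implementation: the post fields (`url`, `title`, `html`, `excerpt`), the mode names, and the functions `build_index` and `search` are all hypothetical, chosen only to illustrate the idea of an inverted index that a search box can query without a server.

```python
import re
from html.parser import HTMLParser


class HeadingExtractor(HTMLParser):
    """Collects the text found inside <h1>..<h6> elements."""

    HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.headings = []
        self._inside = False
        self._buf = []

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADINGS:
            self._inside = True
            self._buf = []

    def handle_endtag(self, tag):
        if tag in self.HEADINGS and self._inside:
            self.headings.append("".join(self._buf).strip())
            self._inside = False

    def handle_data(self, data):
        if self._inside:
            self._buf.append(data)


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def build_index(posts, mode="titles"):
    """Build an inverted index (token -> list of post ids).

    mode is one of "titles", "titles+headings", "titles+excerpt"
    (hypothetical names for the options listed above).
    """
    docs, terms = [], {}
    for doc_id, post in enumerate(posts):
        parts = [post["title"]]
        if mode == "titles+headings":
            parser = HeadingExtractor()
            parser.feed(post["html"])
            parser.close()
            parts.extend(parser.headings)
        elif mode == "titles+excerpt":
            parts.append(post["excerpt"])
        elif mode != "titles":
            raise ValueError(f"unknown mode: {mode}")
        docs.append({"url": post["url"], "title": post["title"]})
        for token in set(tokenize(" ".join(parts))):
            terms.setdefault(token, []).append(doc_id)
    return {"docs": docs, "terms": terms}


def search(index, query):
    """Return the URLs of posts that contain every token of the query."""
    tokens = tokenize(query)
    if not tokens:
        return []
    ids = set(index["terms"].get(tokens[0], []))
    for token in tokens[1:]:
        ids &= set(index["terms"].get(token, []))
    return [index["docs"][i]["url"] for i in sorted(ids)]
```

The narrower the mode, the fewer tokens end up in `terms`, which is what keeps the generated file small. In a real static site the lookup would run in the browser, but the logic would be the same as in `search` here.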
    #3075
    [anonymous]

    Excellent, Bob! Keep up the good work.

    #3077
    [anonymous]

    Thanks for explaining in detail, Bob. Please do show the search hits in their relevant context, i.e. at their exact position in the paragraph, with a presettable number of characters (if possible) before and after the hit, as Google search does. I think FlexiSearch or any search script would do this. I will donate for this in addition to buying the plugin.
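
The "hit in context" behaviour requested here can be sketched in a few lines. This is only an illustration under assumptions: the function name `make_snippet`, the `before`/`after` parameters, and the use of `<mark>` for highlighting are choices made for this example, not something taken from FlexiSearch or Publii.

```python
import html
import re


def make_snippet(text, query, before=40, after=40):
    """Return an HTML snippet around the first hit, or None if there is no hit.

    `before` and `after` are the presettable numbers of characters of
    context to keep on each side of the hit.
    """
    if not query:
        return None
    match = re.search(re.escape(query), text, flags=re.IGNORECASE)
    if match is None:
        return None
    start = max(0, match.start() - before)
    end = min(len(text), match.end() + after)
    prefix = "…" if start > 0 else ""
    suffix = "…" if end < len(text) else ""
    return (
        prefix
        + html.escape(text[start:match.start()])
        + "<mark>" + html.escape(match.group(0)) + "</mark>"
        + html.escape(text[match.end():end])
        + suffix
    )
```

Note that showing context like this requires the full post text (or at least the surrounding sentences) to be available to the search script, so it trades index size for nicer results.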

    Mathias, I think that since a text index does not take up much space at all, even your 2000 posts would be no issue, even for a full-content search like this.

    I know people export their docs and knowledge bases of tens of thousands of pages with software like HelpNDoc, which builds in its own seamless, instant JS search scripts that generate such text indexes. There are also AJAX WordPress plugins that do this by indexing thousands of posts in a text index file. Of course I may be wrong about this; I am just sharing my observations.

    #3079
    [anonymous]

    NitaaiKumar,

    Don’t forget Publii is a static website builder. It cannot dynamically query some database table with an index. All data of all possible searches needs to be present in 1 file. That limits the options.

    Mathias

    #3080
    [anonymous]

    Oh yes, gotcha Mathias.

    The help and docs sites generated by software like HelpNDoc, with tens of thousands of pages, are also purely HTML, with no database or dynamic elements at all. But their self-generated search works great in the sidebar, using a static site index for thousands of pages. So if there is a nice search script that generates a static index, I think it can easily handle tens of thousands of posts.

    When I mentioned WordPress plugins, I was talking about a specific plugin I have used before, which has an option to build a static search text index file to cache and speed up the search with no load on the dynamic database whatsoever. And this worked fine with a blog of 1,000 posts too.

    #4780
    [anonymous]

    I am so delighted to read on your roadmap page that Publii’s own search engine is planned for 2020, and you also said the 4th quarter of 2020. So can we still expect it before the end of 2020, even though plugin support has not been introduced yet? It would really be the best gift for 2021 if possible, or even in Jan-Feb 2021. Is there any rough ETA you are expecting? Thanks so much.

    #4785
    Bob

    It’s postponed to next year. As the search engine seems like a great plugin idea, we decided to roll out the plugin system first.

    #4787
    [anonymous]

    Thanks for the info. So hopefully the first quarter of next year?

    #4791
    Bob

    It’s a free and still non-profit project, so we develop it in our spare time, and it is difficult to tell exactly when it will be done.

    #4873
    [anonymous]

    @OP: In the meantime, I don’t know what your technical background is, but you can look for alternatives to Google search; there are many. For example, the first 10,000 requests per month (usually more than enough for a personal blog site) are free when using Algolia (which I think is used in some Publii themes, from what I remember).
    When looking for other resources to be used with static generators, have a look here or here.

    #4959
    [anonymous]

    Something like this works locally for the user on the static HTML: https://stork-search.net/. Unless your static site is really big, it could be easy to integrate into Publii: generate the index, store it as a .js file, sync/upload to the host, all done. No plugins needed.
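
The "store as a .js file" step can be illustrated generically. The sketch below is not Stork's index format or API (Stork has its own index builder); it only shows how any JSON-serializable index could be written as a script that a static page loads with a plain `<script src="…">` tag. The global name `SEARCH_INDEX` and both function names are hypothetical.

```python
import json
from pathlib import Path


def render_index_js(index, global_name="SEARCH_INDEX"):
    """Serialize an index as a JavaScript file that defines one global."""
    # Compact separators keep the file small; the default ensure_ascii=True
    # escapes non-ASCII characters, so the output is plain ASCII.
    payload = json.dumps(index, separators=(",", ":"))
    # Escape "</" so the payload stays safe even if it is ever inlined
    # into a <script> element ("\/" is a valid escape in JSON and in JS).
    payload = payload.replace("</", "<\\/")
    return f"window.{global_name} = {payload};\n"


def write_index_js(index, path, global_name="SEARCH_INDEX"):
    """Write the rendered script next to the exported HTML files."""
    Path(path).write_text(render_index_js(index, global_name), encoding="utf-8")
```

Because the result is an ordinary static file, it would be uploaded together with the rest of the exported site, with no server-side component involved.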

    #4960
    [anonymous]

    That looks promising!

    #4985
    [anonymous]
    [anonymous] wrote:

    we need to generate an additional file with the index of all your content. To avoid generating a huge file, we plan to add several options, such as:

    1. index the post titles only
    2. index post titles + headings (H1-H6)
    3. index post titles + post entry
    4. maybe others, if you have any good ideas…

    5. index by className, like .publii__index or something like that. Or by some custom item in the editor.

    It might be a good idea to have checkboxes, so users can pick and choose which items to index.
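
The class-based option suggested here could look roughly like the following sketch, which collects only the text inside elements carrying a marker class. The class name `publii__index` comes from the suggestion above; the `ClassTextExtractor` and `extract_indexable_text` names are hypothetical, and the code assumes reasonably well-formed HTML in which every non-void start tag has a matching end tag.

```python
from html.parser import HTMLParser

# Elements that never have an end tag, so they must not affect nesting depth.
VOID_TAGS = {
    "area", "base", "br", "col", "embed", "hr", "img",
    "input", "link", "meta", "source", "track", "wbr",
}


class ClassTextExtractor(HTMLParser):
    """Collects the text of every element that has the given class."""

    def __init__(self, class_name):
        super().__init__()
        self.class_name = class_name
        self.chunks = []
        self._depth = 0  # nesting depth inside a marked element, 0 = outside
        self._buf = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self._depth > 0:
            self._depth += 1
            return
        classes = (dict(attrs).get("class") or "").split()
        if self.class_name in classes:
            self._depth = 1
            self._buf = []

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or self._depth == 0:
            return
        self._depth -= 1
        if self._depth == 0:
            # Collapse runs of whitespace before storing the chunk.
            self.chunks.append(" ".join("".join(self._buf).split()))

    def handle_data(self, data):
        if self._depth > 0:
            self._buf.append(data)


def extract_indexable_text(page_html, class_name="publii__index"):
    """Return the text of each element marked with class_name."""
    parser = ClassTextExtractor(class_name)
    parser.feed(page_html)
    parser.close()
    return parser.chunks
```

The returned chunks would then be fed to whatever indexer is used, so authors decide per element what becomes searchable.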

    #6054
    [anonymous]

    I would like this too. For now I have disabled the search, since I don’t like Google.

    #6057
    Bob

    By the end of this year, the first plugins should be released, and maybe this one as well.

    #6516
    [anonymous]

    What’s the status of either plugins or an embedded search engine?

    I was thinking about this search engine the other day when I was compiling some Doxygen docs for a project at work. Doxygen self-hosts a search engine in static HTML with CSS/JS that all runs off a filesystem, no web server needed. I’m willing to fund a bounty or hire someone to write a search engine that works with the Publii static HTML output, and get it included in the mainline Publii release.

    #6543
    [anonymous]

    Another option for static websites is SearchIQ, which is currently running a lifetime deal: https://appsumo.com/products/searchiq/

    I see that the ProDocs theme uses Algolia.