24 Commits

Author SHA1 Message Date
b4320f611b Release v0.8, minor README changes 2022-10-22 17:20:20 +02:00
1b1ab2387e gui: PreviewGeneratorPdf: Guard cache lookup with mutex
There is no guarantee that the read-only lookup is thread-safe,
so better just lock there too
2022-10-22 15:08:29 +02:00
49a1a14009 gui: previewgenerator: Use QHash and guard using mutexes 2022-10-22 15:07:46 +02:00
48ca25abe3 gui: mainwindow: Reorder members for readability 2022-10-19 11:56:21 +02:00
42e9ac5f41 gui: previews: Ensure order matches relevance ranking
Previously, the order of previews would depend simply
on which generator would finish first.

Fix this by caching out-of-order previews. This may
cause a small delay but should overall be hardly noticeable.
2022-10-19 11:11:29 +02:00
7d3c24e6e1 update README.md,HACKING.md 2022-10-18 16:19:25 +02:00
c155d25a37 shared: sqlitesearch: Search trigram index too
Search the trigram index too, combining the results
with the results of the "normal" fts index.

Prioritize the latter since it makes more sense to
rank whole words higher.
2022-10-18 16:06:10 +02:00
583d5babf3 shared: sqlitedbservice: Insert to trigram index too 2022-10-18 16:05:19 +02:00
45659cdc59 shared: migrations: Add 4.sql: Begin trigram index 2022-10-18 16:04:00 +02:00
3022bbdfb5 sqlitesearch: escapeftsArgument: Fix wrong escaping of phrase queries 2022-10-02 19:55:10 +02:00
b6ac652ade shared: indexer: Report progress more often
Processing dirs with large docs takes time and waiting till the threshold
is reached can be a bit annoying in those cases, so report if last report
was 10+ seconds ago.
2022-09-23 20:08:00 +02:00
785a517d62 Release v0.7 2022-09-10 15:15:39 +02:00
bfb5d71448 submodules: exile.h: Update 2022-09-10 15:15:39 +02:00
3e512b8be0 USAGE.md: Update 2022-09-10 15:15:31 +02:00
2ab6e40d44 gui: mainwindow: Set comboPreviewFiles size policy to Maximum
The current expanding setting resizes, somehow, the mainwindow
when a path name is quite long.
2022-08-28 17:48:16 +02:00
31f0568a87 shared: LimitQueue: Change limit type to int
More consistent with "QQueue::size()" and silences warning
2022-08-28 13:10:39 +02:00
238f9add49 gui: Set a more reasonable maximum width for previews
They won't be this large but in particular for vertical scroll,
this makes way more sense.
2022-08-28 13:05:33 +02:00
7c63ee9178 gui: mainwindow: Set center alignment for previews
Noticeable now that we have vertical scrolling
2022-08-28 13:04:06 +02:00
1edfcc8f23 gui: PreviewGeneratorPlainText: Escape html before working on text
We use this semi-HTML mode to highlight words, but if we already
have tags in the document this does not work quite well.

Thus, escape the string before further processing it
2022-08-28 13:01:46 +02:00
2df273dee3 gui: PreviewGenerator*: Fallback to partial highlighting if no whole word match 2022-08-28 12:44:42 +02:00
5a47f5949f gui: PreviewGeneratorPlainText: Rework snippets selection
The current snippet selection is useless for many queries.

Attempt a more reasonable snippet selection by prioritizing
those where all words are contained. The more words a snippet
has, the more important it is considered. Therefore, those will
be at the top.

Only highlight whole words
2022-08-27 22:18:43 +02:00
e6a0c0daee gui: PreviewGeneratorPdf: Only highlight whole words
Only highlight whole words, which is less confusing
2022-08-27 22:17:10 +02:00
11b070ed42 gui: mainwindow: Improve preview status reporting
Better represent the state of available previews.

Only add paths to ui->comboPreviewFiles once
2022-08-27 12:04:35 +02:00
47874b3706 gui: ipcworker,ipcserver: Refactor
Crashes were observed, faulting in libQtNetwork.
Those were rather rare. We also have no traces.

This probably depends on the order in which signals/slots were
processed. Remove shared state between connections,
such as the IPCPreviewWorker and socket instance in IPCServer.
2022-08-27 11:15:45 +02:00
20 changed files with 355 additions and 125 deletions

View file

@ -1,4 +1,31 @@
# looqs: Release notes
## 2022-10-22 - v0.8
CHANGES:
- For new, not previously indexed files, start creating an additional index using sqlite's experimental trigram tokenizer. Thanks to that, we can now match substrings of at least 3 Unicode characters. Results of the usual index are prioritized.
- GUI: Ensure order of previews matches ranking exactly. Previously, it depended simply on the time preview generators took, i.e. it was more or less a race.
- Report progress more often during indexing, so users don't get the impression that it's stuck when processing dirs with large documents.
- Fix a regression that caused phrase queries to be broken
- Minor improvements
- Add packages: Ubuntu 22.10.
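
The substring matching mentioned in the first v0.8 item can be illustrated with a small standalone sketch. It is not how sqlite implements its tokenizer: it works on bytes rather than Unicode characters for brevity, and the names `trigrams` and `mayContain` are made up for this illustration.

```cpp
#include <cstddef>
#include <set>
#include <string>

// Split a byte string into its overlapping 3-byte windows ("trigrams").
std::set<std::string> trigrams(const std::string &text)
{
    std::set<std::string> result;
    for (std::size_t i = 0; i + 3 <= text.size(); ++i)
    {
        result.insert(text.substr(i, 3));
    }
    return result;
}

// Candidate filter: a document can only contain the query as a substring
// if every trigram of the query also occurs in the document.
// Queries shorter than 3 bytes produce no trigram and are rejected here.
bool mayContain(const std::set<std::string> &docTrigrams, const std::string &query)
{
    if (query.size() < 3)
    {
        return false;
    }
    for (const std::string &t : trigrams(query))
    {
        if (docTrigrams.count(t) == 0)
        {
            return false;
        }
    }
    return true;
}
```

With this sketch, `mayContain(trigrams("invoice"), "voic")` holds because both "voi" and "oic" occur in "invoice". Set containment alone is only a necessary condition, so a real index also has to account for the position of each trigram.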
## 2022-09-10 - v0.7
CHANGES:
- GUI: Add vertical scroll option, default to it. Most feedback considered horizontal scrolling unnatural.
- GUI: Previews: Improve plaintext preview snippet selection by prioritizing those which contain the most search terms
- GUI: Previews: Only highlight whole words, not parts in words.
- GUI: Previews: Don't treat text wrapped inside '<' '>' in plaintext files as HTML tags, which caused such text to get lost in previews.
- GUI: Avoid triggering preview generation in some cases even when previews tab is not the active one
- GUI: Implement a search history. Allow going up/down with arrow keys in the search field.
- GUI: Previews: Allow CTRL + mouse wheel to zoom in on previews
- General: Fix an incorrect sqlite query which caused the ranking information of search results to be lost. "All files" filter in the previews tab therefore now orders by the seemingly most relevant pages, across all documents.
- General: Fix handling of content search queries with prefix terms (those ending in '*').
- GUI: Show how many results are previewable.
- GUI: Refactor to improve stability of sandboxed IPC worker in order to avoid rare segfaults.
## 2022-08-14 - v0.6
This release features multiple fixes and enhancements.

View file

@ -1,12 +1,7 @@
# looqs - Hacking
## Introduction
Without elaborating here, I hacked looqs because I was not satisfied with the state of desktop search on Linux.
Originally a set of CLI python scripts, it is now written in C++ and offers a GUI made using Qt. While a "web app" would have been an option, I prefer a desktop application for something like looqs. I chose Qt because I am more familiar with it than with any other GUI framework. To my knowledge, potential alternatives like GTK do not include as many "batteries" as Qt anyway, so the job presumably would have been harder there,
at least for me.
If you are interested in how to contribute, please see the file [CONTRIBUTING.md](CONTRIBUTING.md) which contains the instructions on how to submit patches etc.
If you are interested on how to contribute, please see the file [CONTRIBUTING.md](CONTRIBUTING.md) which contains the instructions on how to submit patches etc.
## Security
The architecture ensures that the parsing of documents and the preview generation is sandboxed by [exile.h](https://github.com/quitesimpleorg/exile.h). looqs uses a multi-process architecture to achieve this.
@ -16,7 +11,8 @@ Qt code is considered trusted in this model. While one may critize this, it was
Set the environment variable `LOOQS_DISABLE_SANDBOX=1` to disable sandboxing. It's intended for troubleshooting.
## Database
The heart is sqlite, with the FTS5 extensions behind the full-text search. While FTS may not be sqlite's strong suit, I definitly did not want to run one of those oftenly recommended heavy (Java based) solutions. I explored other options like Postgresql, I've discard them due to some limitations back then.
The heart is sqlite, with the FTS5 extensions behind the full-text search. While FTS may not be sqlite's strong suit, I definitely did not want to run one of those oftenly recommended heavy (Java based) solutions. I explored other options like Postgresql, I've discard them due to some limitations back then. It's also natural to use sqlite as it's
used for metadata in general.
Down the road, alternatives will be explored of course if sqlite should not suffice anymore.
@ -27,7 +23,7 @@ looqs simply strips the tags and that seems to work fine so far. Naturally, this
Naturally, looqs won't be able to index and render previews for everything. Such an approach would create a huge, bloated binary. In the future, there will be a plugin system of some sort: either we will load .so objects or use subprocesses.
## Name
looqs looks for files. You as the user can also look inside them. The 'k' in "looks" was replaced by a 'q'. Originally, I wanted my projects to have "qs" (for quitesimple) in their name. While abandoned now, this got us to looqs.
looqs looks for files. You as the user can also look inside them. The 'k' in "looks" was replaced by a 'q'. Originally, I wanted my projects to have "qs" (for quitesimple) in their name. While that quirk is abandoned now, this got us to looqs.

View file

@ -28,9 +28,12 @@ There is no need to write the long form of filters. There are also booleans avai
The screenshots in this section may occasionally be slightly outdated, but they are usually recent enough to get an overall impression of the current state of the GUI.
## Current status
Latest version: 2022-08-14, v0.6
Latest version: 2022-10-22, v0.8
Please see [Changelog](CHANGELOG.md) for a human readable list of changes.
Please keep in mind: looqs is still at an early stage and may exhibit some weirdness and contain bugs.
Please see [Changelog](CHANGELOG.md) for a human readable list of changes. For download instructions, see
further down this document.
## Goals and principles
@ -94,7 +97,7 @@ The GUI is located in `gui/looqs-gui`, the binary for the CLI is in `cli/looqs`
## Packages
At this point, looqs is not in any official distro package repo, but I maintain some packages.
### Ubuntu 22.04
### Ubuntu 22.04, 22.10
Latest release can be installed using apt from the repo.
```
# First, obtain key, assume it's trusted.

View file

@ -47,16 +47,23 @@ you can quickly perform content searches in paths containing 'docs'.
**CTRL + W**: Removes the last filter. Taking the example above, "p:(docs) c:(invoice credit card)", CTRL + W removes "c:(invoice credit card)".
The arrow keys (up and down) can be used to go back and forward in the search history.
### Configuring PDF viewer
It's most convenient if, when you click on a preview, the PDF reader opens the page you clicked. For that, looqs needs to know which viewer you want to launch.
It tries to auto detect some common viewers. You must set the value in the "Settings" tab yourself if the
default does not work for you. In the command line options, "%f" represents the filepath, "%p" the page number.
### Preview tab
The preview tab shows previews. It marks your search keywords too. Click on a preview to open the file.
### Previews tab
The 'previews' tab shows previews. It marks your search keywords too. Click on a preview to open the file.
A right click on a preview allows you to copy the file path, or to open the containing folder.
When the combobox is set to "All previews", the previews are ordered by relevance from all documents/pages.
By default, vertical scrolling is active. In the settings, it can be changed to horizontal scrolling, which may be
preferred by users of (larger) wide screen monitors.
### Syncing index
Over time, files get deleted or their content changes. Go to **looqs** -> **Sync index**. looqs will reindex the content of files which have been changed. Files that cannot be found anymore will be removed from the index.

View file

@ -18,6 +18,10 @@ class IPCPreviewWorker : public QObject
IPCPreviewWorker(QLocalSocket *peer);
void start(RenderConfig config, const QVector<RenderTarget> &targets);
void stop();
~IPCPreviewWorker()
{
delete this->peer;
}
private slots:
void shutdownSocket();

View file

@ -15,6 +15,7 @@
#include <QFileDialog>
#include <QScreen>
#include <QProgressDialog>
#include <QDesktopWidget>
#include "mainwindow.h"
#include "ui_mainwindow.h"
#include "clicklabel.h"
@ -642,11 +643,11 @@ void MainWindow::previewReceived(QSharedPointer<PreviewResult> preview, unsigned
{
QString docPath = preview->getDocumentPath();
auto previewPage = preview->getPage();
ClickLabel *headerLabel = new ClickLabel();
headerLabel->setText(QString("Path: ") + preview->getDocumentPath());
ClickLabel *label = dynamic_cast<ClickLabel *>(preview->createPreviewWidget());
label->setMaximumWidth(QApplication::desktop()->availableGeometry().width() - 200);
QVBoxLayout *previewLayout = new QVBoxLayout();
@ -678,11 +679,29 @@ void MainWindow::previewReceived(QSharedPointer<PreviewResult> preview, unsigned
previewLayout->setMargin(0);
previewLayout->insertStretch(0, 1);
previewLayout->insertStretch(-1, 1);
previewLayout->setAlignment(Qt::AlignCenter);
QWidget *previewWidget = new QWidget();
previewWidget->setLayout(previewLayout);
ui->scrollAreaWidgetContents->layout()->addWidget(previewWidget);
QBoxLayout *layout = static_cast<QBoxLayout *>(ui->scrollAreaWidgetContents->layout());
int pos = previewOrder[docPath + QString::number(previewPage)];
if(pos <= layout->count())
{
layout->insertWidget(pos, previewWidget);
for(auto it = previewWidgetOrderCache.constKeyValueBegin();
it != previewWidgetOrderCache.constKeyValueEnd(); it++)
{
if(it->first <= layout->count())
{
layout->insertWidget(it->first, it->second);
}
}
}
else
{
previewWidgetOrderCache[pos] = previewWidget;
}
}
}
@ -806,6 +825,7 @@ void MainWindow::handleSearchResults(const QVector<SearchResult> &results)
ui->comboPreviewFiles->clear();
ui->comboPreviewFiles->addItem("All previews");
ui->comboPreviewFiles->setVisible(true);
ui->lblTotalPreviewPagesCount->setText("");
bool hasDeleted = false;
QHash<QString, bool> seenMap;
@ -816,7 +836,6 @@ void MainWindow::handleSearchResults(const QVector<SearchResult> &results)
if(!seenMap.contains(absPath))
{
seenMap[absPath] = true;
QString fileName = pathInfo.fileName();
QTreeWidgetItem *item = new QTreeWidgetItem(ui->treeResultsList);
@ -830,17 +849,18 @@ void MainWindow::handleSearchResults(const QVector<SearchResult> &results)
bool exists = pathInfo.exists();
if(exists)
{
if(!result.wasContentSearch)
if(result.wasContentSearch)
{
continue;
}
if(!pathInfo.suffix().contains("htm")) // hack until we can preview them properly...
{
if(PreviewGenerator::get(pathInfo) != nullptr)
if(!pathInfo.suffix().contains("htm")) // hack until we can preview them properly...
{
this->previewableSearchResults.append(result);
ui->comboPreviewFiles->addItem(result.fileData.absPath);
if(PreviewGenerator::get(pathInfo) != nullptr)
{
this->previewableSearchResults.append(result);
if(!seenMap.contains(result.fileData.absPath))
{
ui->comboPreviewFiles->addItem(result.fileData.absPath);
}
}
}
}
}
@ -848,6 +868,7 @@ void MainWindow::handleSearchResults(const QVector<SearchResult> &results)
{
hasDeleted = true;
}
seenMap[absPath] = true;
}
ui->treeResultsList->resizeColumnToContents(0);
@ -863,6 +884,7 @@ void MainWindow::handleSearchResults(const QVector<SearchResult> &results)
}
QString statusText = "Results: " + QString::number(results.size()) + " files";
statusText += ", previewable: " + QString::number(this->previewableSearchResults.count());
if(hasDeleted)
{
statusText += " WARNING: Some files are inaccessible. No preview available for those. Index may be out of sync";
@ -932,6 +954,10 @@ void MainWindow::makePreviews(int page)
renderConfig.scaleY = QGuiApplication::primaryScreen()->physicalDotsPerInchY() * (currentScale / 100.);
renderConfig.wordsToHighlight = wordsToHighlight;
this->previewOrder.clear();
this->previewWidgetOrderCache.clear();
int previewPos = 0;
QVector<RenderTarget> targets;
for(SearchResult &sr : this->previewableSearchResults)
{
@ -946,6 +972,7 @@ void MainWindow::makePreviews(int page)
renderTarget.path = sr.fileData.absPath;
renderTarget.page = (int)sr.page;
targets.append(renderTarget);
this->previewOrder[renderTarget.path + QString::number(renderTarget.page)] = previewPos++;
}
int numpages = ceil(static_cast<double>(targets.size()) / previewsPerPage);
ui->spinPreviewPage->setMaximum(numpages);
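
The ordering logic in `previewReceived` above (insert a preview if its position is already reachable, otherwise park it until the earlier ones have arrived) can be sketched without Qt. This is a minimal illustration under assumptions: `OrderedSink` is a hypothetical name, a `std::vector` stands in for the layout, and unlike the hunk above the sketch erases parked entries once they have been placed.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

class OrderedSink
{
    std::vector<std::string> shown;             // stands in for the layout
    std::map<std::size_t, std::string> pending; // arrived too early, keyed by target position

  public:
    void receive(std::size_t pos, const std::string &item)
    {
        if (pos > shown.size())
        {
            // Earlier items are still missing: park this one for later.
            pending[pos] = item;
            return;
        }
        shown.insert(shown.begin() + static_cast<std::ptrdiff_t>(pos), item);

        // Place parked items that have become reachable, lowest position first.
        auto it = pending.begin();
        while (it != pending.end() && it->first <= shown.size())
        {
            shown.insert(shown.begin() + static_cast<std::ptrdiff_t>(it->first), it->second);
            it = pending.erase(it);
        }
    }

    const std::vector<std::string> &items() const
    {
        return shown;
    }
};
```

Because `std::map` iterates in ascending key order, each placed item can make the next parked one reachable within the same loop.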

View file

@ -23,16 +23,6 @@ class MainWindow : public QMainWindow
{
Q_OBJECT
public:
explicit MainWindow(QWidget *parent, QString socketPath);
~MainWindow();
signals:
void beginSearch(const QString &query);
void startPdfPreviewGeneration(QVector<SearchResult> paths, double scalefactor);
protected:
void closeEvent(QCloseEvent *event) override;
private:
DatabaseFactory *dbFactory;
SqliteDbService *dbService;
@ -40,37 +30,40 @@ class MainWindow : public QMainWindow
IPCPreviewClient ipcPreviewClient;
QThread ipcClientThread;
QThread syncerThread;
Indexer *indexer;
IndexSyncer *indexSyncer;
QProgressDialog progressDialog;
Indexer *indexer;
QFileIconProvider iconProvider;
bool previewDirty;
QSqlDatabase db;
QFutureWatcher<QVector<SearchResult>> searchWatcher;
void add(QString path, unsigned int page);
QVector<SearchResult> previewableSearchResults;
LooqsQuery contentSearchQuery;
QVector<QString> searchHistory;
int currentSearchHistoryIndex = 0;
QString currentSavedSearchText;
QHash<QString, int> previewOrder; /* Quick lookup for the order a preview should have */
QMap<int, QWidget *>
previewWidgetOrderCache /* Saves those that arrived out of order to be inserted later at the correct pos */;
bool previewDirty;
int previewsPerPage;
unsigned int processedPdfPreviews;
unsigned int currentPreviewGeneration = 1;
void connectSignals();
void makePreviews(int page);
bool previewTabActive();
bool indexerTabActive();
void keyPressEvent(QKeyEvent *event) override;
unsigned int processedPdfPreviews;
void handleSearchResults(const QVector<SearchResult> &results);
void handleSearchError(QString error);
LooqsQuery contentSearchQuery;
int previewsPerPage;
void createSearchResutlMenu(QMenu &menu, const QFileInfo &fileInfo);
void openDocument(QString path, int num);
void openFile(QString path);
unsigned int currentPreviewGeneration = 1;
void initSettingsTabs();
int currentSelectedScale();
void processShortcut(int key);
bool eventFilter(QObject *object, QEvent *event);
QVector<QString> searchHistory;
int currentSearchHistoryIndex = 0;
QString currentSavedSearchText;
private slots:
void lineEditReturnPressed();
void treeSearchItemActivated(QTreeWidgetItem *item, int i);
@ -90,6 +83,16 @@ class MainWindow : public QMainWindow
void startIpcPreviews(RenderConfig config, const QVector<RenderTarget> &targets);
void stopIpcPreviews();
void beginIndexSync();
public:
explicit MainWindow(QWidget *parent, QString socketPath);
~MainWindow();
signals:
void beginSearch(const QString &query);
void startPdfPreviewGeneration(QVector<SearchResult> paths, double scalefactor);
protected:
void closeEvent(QCloseEvent *event) override;
};
#endif // MAINWINDOW_H

View file

@ -7,7 +7,7 @@
<x>0</x>
<y>0</y>
<width>1280</width>
<height>888</height>
<height>923</height>
</rect>
</property>
<property name="windowTitle">
@ -27,7 +27,7 @@
<enum>QTabWidget::South</enum>
</property>
<property name="currentIndex">
<number>3</number>
<number>1</number>
</property>
<widget class="QWidget" name="resultsTab">
<attribute name="title">
@ -82,7 +82,7 @@
<x>0</x>
<y>0</y>
<width>1244</width>
<height>598</height>
<height>633</height>
</rect>
</property>
<layout class="QHBoxLayout" name="horizontalLayout"/>
@ -165,7 +165,7 @@
<item>
<widget class="QComboBox" name="comboPreviewFiles">
<property name="sizePolicy">
<sizepolicy hsizetype="Expanding" vsizetype="Fixed">
<sizepolicy hsizetype="Maximum" vsizetype="Fixed">
<horstretch>0</horstretch>
<verstretch>0</verstretch>
</sizepolicy>

View file

@ -1,20 +1,24 @@
#include "../shared/common.h"
#include "previewgenerator.h"
#include <QMutexLocker>
#include "previewgeneratorpdf.h"
#include "previewgeneratorplaintext.h"
#include "previewgeneratorodt.h"
static PreviewGenerator *plainTextGenerator = new PreviewGeneratorPlainText();
static QMap<QString, PreviewGenerator *> generators{
static QHash<QString, PreviewGenerator *> generators{
{"pdf", new PreviewGeneratorPdf()}, {"txt", plainTextGenerator}, {"md", plainTextGenerator},
{"py", plainTextGenerator}, {"java", plainTextGenerator}, {"js", plainTextGenerator},
{"cpp", plainTextGenerator}, {"c", plainTextGenerator}, {"sql", plainTextGenerator},
{"odt", new PreviewGeneratorOdt()}};
static QMutex generatorsMutex;
PreviewGenerator *PreviewGenerator::get(QFileInfo &info)
{
QMutexLocker locker(&generatorsMutex);
PreviewGenerator *result = generators.value(info.suffix(), nullptr);
locker.unlock();
if(result == nullptr)
{
if(Common::isTextFile(info))

View file

@ -1,15 +1,18 @@
#include <QMutexLocker>
#include <QPainter>
#include <QRegularExpression>
#include "previewgeneratorpdf.h"
static QMutex cacheMutex;
Poppler::Document *PreviewGeneratorPdf::document(QString path)
{
QMutexLocker locker(&cacheMutex);
if(documentcache.contains(path))
{
return documentcache.value(path);
}
locker.unlock();
Poppler::Document *result = Poppler::Document::load(path);
if(result == nullptr)
{
@ -17,7 +20,8 @@ Poppler::Document *PreviewGeneratorPdf::document(QString path)
return nullptr;
}
result->setRenderHint(Poppler::Document::TextAntialiasing);
QMutexLocker locker(&cacheMutex);
locker.relock();
documentcache.insert(path, result);
locker.unlock();
return result;
@ -45,7 +49,12 @@ QSharedPointer<PreviewResult> PreviewGeneratorPdf::generate(RenderConfig config,
QImage img = pdfPage->renderToImage(config.scaleX, config.scaleY);
for(QString &word : config.wordsToHighlight)
{
QList<QRectF> rects = pdfPage->search(word, Poppler::Page::SearchFlag::IgnoreCase);
QList<QRectF> rects =
pdfPage->search(word, Poppler::Page::SearchFlag::IgnoreCase | Poppler::Page::SearchFlag::WholeWords);
if(rects.empty())
{
rects = pdfPage->search(word, Poppler::Page::SearchFlag::IgnoreCase);
}
for(QRectF &rect : rects)
{
QPainter painter(&img);

Wyświetl plik

@ -1,4 +1,5 @@
#include <QTextStream>
#include <QRegularExpression>
#include "previewgeneratorplaintext.h"
#include "previewresultplaintext.h"
@ -57,6 +58,7 @@ QString PreviewGeneratorPlainText::generatePreviewText(QString content, RenderCo
++i;
}
resulText = resulText.toHtmlEscaped();
QString header = "<b>" + fileName + "</b> ";
for(QString &word : config.wordsToHighlight)
{
@ -74,10 +76,19 @@ QString PreviewGeneratorPlainText::generatePreviewText(QString content, RenderCo
return header + resulText.replace("\n", "<br>").mid(0, 1000);
}
struct Snippet
{
/* Contains each line number and line of the snippet*/
QString snippetText;
/* How many times a word occurs in the snippetText */
QHash<QString, int> wordCountMap;
};
QString PreviewGeneratorPlainText::generateLineBasedPreviewText(QTextStream &in, RenderConfig config, QString fileName)
{
QString resultText;
const unsigned int contextLinesCount = 2;
QVector<Snippet> snippets;
const int contextLinesCount = 2;
LimitQueue<QString> queue(contextLinesCount);
QString currentLine;
currentLine.reserve(512);
@ -85,38 +96,73 @@ QString PreviewGeneratorPlainText::generateLineBasedPreviewText(QTextStream &in,
/* How many lines to read after a line with a match (like grep -A ) */
int justReadLinesCount = -1;
auto appendLine = [&resultText](int lineNumber, QString &line)
{ resultText.append(QString("<b>%1</b>%2<br>").arg(lineNumber).arg(line)); };
struct Snippet currentSnippet;
QHash<QString, int> countmap;
QString header = "<b>" + fileName + "</b> ";
unsigned int snippetsCount = 0;
unsigned int lineCount = 0;
while(in.readLineInto(&currentLine) && snippetsCount < MAX_SNIPPETS)
auto appendLine = [&currentSnippet, &config](int lineNumber, QString &line)
{
int foundWordsCount = 0;
for(QString &word : config.wordsToHighlight)
{
QRegularExpression searchRegex("\\b" + word + "\\b");
bool containsRegex = line.contains(searchRegex);
bool contains = false;
if(!containsRegex)
{
contains = line.contains(word, Qt::CaseInsensitive);
}
if(containsRegex || contains)
{
currentSnippet.wordCountMap[word] = currentSnippet.wordCountMap.value(word, 0) + 1;
QString replacementString = "<span style=\"background-color: yellow;\">" + word + "</span>";
if(containsRegex)
{
line.replace(searchRegex, replacementString);
}
else
{
line.replace(word, replacementString, Qt::CaseInsensitive);
}
++foundWordsCount;
}
}
currentSnippet.snippetText.append(QString("<b>%1</b>%2<br>").arg(lineNumber).arg(line));
return foundWordsCount;
};
unsigned int lineCount = 0;
while(in.readLineInto(&currentLine))
{
currentLine = currentLine.toHtmlEscaped();
++lineCount;
bool matched = false;
if(justReadLinesCount > 0)
{
appendLine(lineCount, currentLine);
--justReadLinesCount;
int result = appendLine(lineCount, currentLine);
if(justReadLinesCount == 1 && result > 0)
{
justReadLinesCount = contextLinesCount;
}
else
{
--justReadLinesCount;
}
continue;
}
if(justReadLinesCount == 0)
{
resultText += "---<br>";
currentSnippet.snippetText += "---<br>";
justReadLinesCount = -1;
++snippetsCount;
snippets.append(currentSnippet);
currentSnippet = {};
}
for(QString &word : config.wordsToHighlight)
{
if(currentLine.contains(word, Qt::CaseInsensitive))
{
countmap[word] = countmap.value(word, 0) + 1;
matched = true;
currentLine.replace(word, "<span style=\"background-color: yellow;\">" + word + "</span>",
Qt::CaseInsensitive);
break;
}
}
if(matched)
@ -125,7 +171,6 @@ QString PreviewGeneratorPlainText::generateLineBasedPreviewText(QTextStream &in,
{
int queuedLineCount = lineCount - queue.size();
QString queuedLine = queue.dequeue();
appendLine(queuedLineCount, queuedLine);
}
appendLine(lineCount, currentLine);
@ -137,13 +182,77 @@ QString PreviewGeneratorPlainText::generateLineBasedPreviewText(QTextStream &in,
}
}
if(!currentSnippet.snippetText.isEmpty())
{
currentSnippet.snippetText += "---<br>";
snippets.append(currentSnippet);
}
std::sort(snippets.begin(), snippets.end(),
[](Snippet &a, Snippet &b)
{
int differentWordsA = 0;
int totalWordsA = 0;
int differentWordsB = 0;
int totalWordsB = 0;
for(int count : a.wordCountMap.values())
{
if(count > 0)
{
++differentWordsA;
}
totalWordsA += count;
}
for(int count : b.wordCountMap.values())
{
if(count > 0)
{
++differentWordsB;
}
totalWordsB += count;
}
if(differentWordsA > differentWordsB)
{
return true;
}
if(differentWordsA == differentWordsB)
{
return totalWordsA > totalWordsB;
}
return false;
});
QString resultText = "";
unsigned int snippetsCount = 0;
QString header = "<b>" + fileName + "</b> ";
QHash<QString, int> totalWordCountMap;
bool isTruncated = false;
for(Snippet &snippet : snippets)
{
if(snippetsCount++ < MAX_SNIPPETS)
{
resultText += snippet.snippetText;
}
else
{
isTruncated = true;
}
for(auto it = snippet.wordCountMap.keyValueBegin(); it != snippet.wordCountMap.keyValueEnd(); it++)
{
totalWordCountMap[it->first] = totalWordCountMap.value(it->first, 0) + it->second;
}
}
if(isTruncated)
{
header += "(truncated) ";
}
for(QString &word : config.wordsToHighlight)
{
header += word + ": " + QString::number(countmap[word]) + " ";
}
if(snippetsCount == MAX_SNIPPETS)
{
header += "(truncated)";
header += word + ": " + QString::number(totalWordCountMap[word]) + " ";
}
header += "<hr>";
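
The ranking rule used by the `std::sort` call above (more distinct search words first, ties broken by the total number of hits) can be sketched with standard containers. The `Snippet` struct here is a simplified stand-in for the one in the diff, and `score` and `rankSnippets` are hypothetical helper names.

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <utility>
#include <vector>

struct Snippet
{
    std::string text;
    std::map<std::string, int> wordCounts; // search word -> hits inside this snippet
};

// Returns (number of distinct words found, total number of hits).
std::pair<int, int> score(const Snippet &s)
{
    int distinct = 0;
    int total = 0;
    for (const auto &entry : s.wordCounts)
    {
        if (entry.second > 0)
        {
            ++distinct;
        }
        total += entry.second;
    }
    return {distinct, total};
}

// Best snippets first. std::pair compares lexicographically, so the
// distinct-word count decides and the total only breaks ties.
void rankSnippets(std::vector<Snippet> &snippets)
{
    std::stable_sort(snippets.begin(), snippets.end(),
                     [](const Snippet &a, const Snippet &b) { return score(a) > score(b); });
}
```

`std::stable_sort` is a choice of this sketch: it keeps snippets with equal scores in file order.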

View file

@ -152,12 +152,14 @@ void Indexer::processFileScanResult(FileScanResult result)
++this->currentIndexResult.erroredPaths;
}
if(currentScanProcessedCount++ == progressReportThreshold)
QTime currentTime = QTime::currentTime();
if(currentScanProcessedCount++ == progressReportThreshold || this->lastProgressReportTime.secsTo(currentTime) >= 10)
{
emit indexProgress(this->currentIndexResult.total(), this->currentIndexResult.addedPaths,
this->currentIndexResult.skippedPaths, this->currentIndexResult.erroredPaths,
this->dirScanner->pathCount());
currentScanProcessedCount = 0;
this->lastProgressReportTime = currentTime;
}
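
The reporting condition above (report when the processed count reaches the threshold, or when the last report is at least 10 seconds old) can be sketched as a small helper. `ProgressThrottle` is a hypothetical name. The sketch takes the current time as a parameter so it can be exercised without waiting, and it uses `std::chrono::steady_clock` instead of `QTime`, which is a choice of the sketch.

```cpp
#include <chrono>

class ProgressThrottle
{
  public:
    using Clock = std::chrono::steady_clock;

    ProgressThrottle(unsigned int threshold, std::chrono::seconds maxSilence, Clock::time_point start)
        : threshold(threshold), maxSilence(maxSilence), lastReport(start)
    {
    }

    // Call once per processed item. Returns true when progress should be reported.
    bool shouldReport(Clock::time_point now)
    {
        ++processed;
        if (processed >= threshold || now - lastReport >= maxSilence)
        {
            processed = 0;
            lastReport = now;
            return true;
        }
        return false;
    }

  private:
    unsigned int threshold;
    std::chrono::seconds maxSilence;
    unsigned int processed = 0;
    Clock::time_point lastReport;
};
```

A caller would construct it with something like `ProgressThrottle(100, std::chrono::seconds(10), ProgressThrottle::Clock::now())` and pass `Clock::now()` on each call. The value 100 is only a placeholder.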
}

View file

@ -72,6 +72,8 @@ class Indexer : public QObject
IndexResult currentIndexResult;
void launchWorker(ConcurrentQueue<QString> &queue, int batchsize);
QTime lastProgressReportTime = QTime::currentTime();
public:
bool isRunning();

View file

@ -6,11 +6,11 @@ template <class T> class LimitQueue
{
protected:
QQueue<T> queue;
unsigned int limit = 0;
int limit = 0;
public:
LimitQueue();
LimitQueue(unsigned int limit)
LimitQueue(int limit)
{
this->limit = limit;
}
@ -34,7 +34,7 @@ template <class T> class LimitQueue
return queue.dequeue();
}
void setLimit(unsigned int limit)
void setLimit(int limit)
{
this->limit = limit;
}

3
shared/migrations/4.sql Normal file
View file

@ -0,0 +1,3 @@
CREATE VIRTUAL TABLE fts_trigram USING fts5(content, content='',tokenize="trigram");
ALTER TABLE content ADD COLUMN fts_trigramid integer;
CREATE INDEX content_fts_trigramid ON content (fts_trigramid);

View file

@ -3,5 +3,6 @@
<file>1.sql</file>
<file>2.sql</file>
<file>3.sql</file>
<file>4.sql</file>
</qresource>
</RCC>

View file

@ -110,6 +110,44 @@ unsigned int SqliteDbService::getFiles(QVector<FileData> &results, QString wildC
return processedRows;
}
bool SqliteDbService::insertToFTS(bool useTrigrams, QSqlDatabase &db, int fileid, QVector<PageData> &pageData)
{
QString ftsInsertStatement;
QString contentInsertStatement;
if(useTrigrams)
{
ftsInsertStatement = "INSERT INTO fts_trigram(content) VALUES(?)";
contentInsertStatement = "INSERT INTO content(fileid, page, fts_trigramid) VALUES(?, ?, last_insert_rowid())";
}
else
{
ftsInsertStatement = "INSERT INTO fts(content) VALUES(?)";
contentInsertStatement = "INSERT INTO content(fileid, page, ftsid) VALUES(?, ?, last_insert_rowid())";
}
for(const PageData &data : pageData)
{
QSqlQuery ftsQuery(db);
ftsQuery.prepare(ftsInsertStatement);
ftsQuery.addBindValue(data.content);
if(!ftsQuery.exec())
{
Logger::error() << "Failed fts insertion " << ftsQuery.lastError() << Qt::endl;
return false;
}
QSqlQuery contentQuery(db);
contentQuery.prepare(contentInsertStatement);
contentQuery.addBindValue(fileid);
contentQuery.addBindValue(data.pagenumber);
if(!contentQuery.exec())
{
Logger::error() << "Failed content insertion " << contentQuery.lastError() << Qt::endl;
return false;
}
}
return true;
}
SaveFileResult SqliteDbService::saveFile(QFileInfo fileInfo, QVector<PageData> &pageData)
{
QString absPath = fileInfo.absoluteFilePath();
@ -149,24 +187,18 @@ SaveFileResult SqliteDbService::saveFile(QFileInfo fileInfo, QVector<PageData> &
}
int lastid = inserterQuery.lastInsertId().toInt();
for(const PageData &data : pageData)
if(!insertToFTS(false, db, lastid, pageData))
{
QSqlQuery ftsQuery(db);
ftsQuery.prepare("INSERT INTO fts(content) VALUES(?)");
ftsQuery.addBindValue(data.content);
ftsQuery.exec();
QSqlQuery contentQuery(db);
contentQuery.prepare("INSERT INTO content(fileid, page, ftsid) VALUES(?, ?, last_insert_rowid())");
contentQuery.addBindValue(lastid);
contentQuery.addBindValue(data.pagenumber);
if(!contentQuery.exec())
{
db.rollback();
Logger::error() << "Failed content insertion " << contentQuery.lastError() << Qt::endl;
return DBFAIL;
}
db.rollback();
Logger::error() << "Failed to insert data to FTS index " << Qt::endl;
return DBFAIL;
}
if(!insertToFTS(true, db, lastid, pageData))
{
db.rollback();
Logger::error() << "Failed to insert data to FTS index " << Qt::endl;
return DBFAIL;
}
if(!db.commit())
{
db.rollback();

View file

@ -13,6 +13,7 @@ class SqliteDbService
{
private:
DatabaseFactory *dbFactory = nullptr;
bool insertToFTS(bool useTrigrams, QSqlDatabase &db, int fileid, QVector<PageData> &pageData);
public:
SqliteDbService(DatabaseFactory &dbFactory);

View file

@ -82,11 +82,10 @@ QString SqliteSearch::escapeFtsArgument(QString ftsArg)
{
value = value.mid(0, value.size() - 1);
}
result += "\"" + value + "\"*";
result += "\"" + value + "\"* ";
}
else
{
value = "\"\"" + value + "\"\"";
result += "\"" + value + "\" ";
}
}
@ -142,9 +141,7 @@ QPair<QString, QVector<QString>> SqliteSearch::createSql(const Token &token)
}
if(token.type == FILTER_CONTENT_CONTAINS)
{
return {" content.id IN (SELECT fts.ROWID FROM fts WHERE fts.content MATCH ? ORDER BY "
"rank) ",
{escapeFtsArgument(value)}};
return {" fts MATCH ? ", {escapeFtsArgument(value)}};
}
throw LooqsGeneralException("Unknown token passed (should not happen)");
}
@ -164,26 +161,14 @@ QSqlQuery SqliteSearch::makeSqlQuery(const LooqsQuery &query)
auto tokens = query.getTokens();
for(const Token &token : tokens)
{
if(token.type == FILTER_CONTENT_CONTAINS)
{
if(!ftsAlreadyJoined)
{
joinSql += " INNER JOIN fts ON content.ftsid = fts.ROWID ";
ftsAlreadyJoined = true;
}
whereSql += " fts.content MATCH ? ";
bindValues.append(escapeFtsArgument(token.value));
}
else
{
auto sql = createSql(token);
whereSql += sql.first;
bindValues.append(sql.second);
}
auto sql = createSql(token);
whereSql += sql.first;
bindValues.append(sql.second);
}
QString prepSql;
QString sortSql = createSortSql(query.getSortConditions());
int bindIterations = 1;
if(isContentSearch)
{
if(sortSql.isEmpty())
@ -191,12 +176,24 @@ QSqlQuery SqliteSearch::makeSqlQuery(const LooqsQuery &query)
if(std::find_if(tokens.begin(), tokens.end(),
[](const Token &t) -> bool { return t.type == FILTER_CONTENT_CONTAINS; }) != tokens.end())
{
sortSql = "ORDER BY rank";
sortSql = "ORDER BY prio, rank";
}
}
prepSql = "SELECT file.path AS path, content.page AS page, file.mtime AS mtime, file.size AS size, "
"file.filetype AS filetype FROM file INNER JOIN content ON file.id = content.fileid " +
joinSql + " WHERE 1=1 AND " + whereSql + " " + sortSql;
QString whereSqlTrigram = whereSql;
whereSqlTrigram.replace("fts MATCH", "fts_trigram MATCH"); // A bit dirty...
prepSql =
"SELECT DISTINCT path, page, mtime, size, filetype FROM ("
"SELECT file.path AS path, content.page AS page, file.mtime AS mtime, file.size AS size, "
"file.filetype AS filetype, 0 AS prio, fts.rank AS rank FROM file INNER JOIN content ON file.id = "
"content.fileid "
"INNER JOIN fts ON content.ftsid = fts.ROWID WHERE 1=1 AND " +
whereSql +
"UNION ALL SELECT file.path AS path, content.page AS page, file.mtime AS mtime, file.size AS size, "
"file.filetype AS filetype, 1 as prio, fts_trigram.rank AS rank FROM file INNER JOIN content ON file.id = "
"content.fileid " +
"INNER JOIN fts_trigram ON content.fts_trigramid = fts_trigram.ROWID WHERE 1=1 AND " + whereSqlTrigram +
" ) " + sortSql;
++bindIterations;
}
else
{
@ -216,11 +213,14 @@ QSqlQuery SqliteSearch::makeSqlQuery(const LooqsQuery &query)
QSqlQuery dbquery(*db);
dbquery.prepare(prepSql);
for(const QString &value : bindValues)
for(int i = 0; i < bindIterations; i++)
{
if(value != "")
for(const QString &value : bindValues)
{
dbquery.addBindValue(value);
if(value != "")
{
dbquery.addBindValue(value);
}
}
}
return dbquery;
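
The quoting that `escapeFtsArgument` deals with can be illustrated in isolation. The following is a simplified sketch and not the project's function: it assumes whitespace-separated terms, does not handle phrase queries, and the names `quoteFtsTerm` and `quoteFtsTerms` are made up. It relies on FTS5 query syntax, where a term is written as a double-quoted string, an embedded double quote is doubled, and a prefix query puts the `*` after the closing quote.

```cpp
#include <sstream>
#include <string>

// Quote a single term for an FTS5 MATCH expression.
std::string quoteFtsTerm(const std::string &term)
{
    const bool isPrefix = term.size() > 1 && term.back() == '*';
    const std::string body = isPrefix ? term.substr(0, term.size() - 1) : term;

    std::string quoted = "\"";
    for (char c : body)
    {
        if (c == '"')
        {
            quoted += "\"\""; // embedded quotes are escaped by doubling them
        }
        else
        {
            quoted += c;
        }
    }
    quoted += "\"";
    if (isPrefix)
    {
        quoted += "*"; // the prefix marker stays outside the quotes
    }
    return quoted;
}

// Quote every whitespace-separated term and join the results with spaces.
std::string quoteFtsTerms(const std::string &input)
{
    std::istringstream in(input);
    std::string term;
    std::string result;
    while (in >> term)
    {
        if (!result.empty())
        {
            result += ' ';
        }
        result += quoteFtsTerm(term);
    }
    return result;
}
```

The separating space between terms matters: without it, adjacent quoted terms run together, which is the kind of problem the added trailing spaces in the hunk above address.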