20 commits

Author SHA1 Message Date
02dd7b64b5 gui: PreviewGeneratorOdt: Adjust for DocumentProcessResult 2023-05-14 14:28:11 +02:00
7c30124743 fixup! shared: Add DocumentProcessResult 2023-05-14 14:27:22 +02:00
6a8323f2cf cli: cli.pro: Add poppler include path 2023-05-14 14:27:06 +02:00
763bc47a89 shared: LooqsQuery: Add outline search filters 2023-05-14 14:23:57 +02:00
517e62dca2 shared: sqlitesearch: Begin outline search 2023-05-14 14:23:21 +02:00
0f47f581b3 shared: SqliteDbService: Add insertOutline(), Use DocumentProcessResult 2023-05-14 14:21:22 +02:00
18b18d5103 shared: FileSaver: Use DocumentProcessResult 2023-05-14 14:20:25 +02:00
f4eed7a6ef fixup! shared: processors: Use DocumentProcessResult instead of PageData vector 2023-05-14 14:19:28 +02:00
6878f7846a shared: token: Add FILTER_OUTLINE_CONTAINS 2023-05-14 14:19:07 +02:00
b2ae0e488f PdfProcessor: Extract outline from documents 2023-05-14 14:15:50 +02:00
02a371b81e shared: processors: Use DocumentProcessResult instead of PageData vector 2023-05-14 14:14:59 +02:00
d960570171 shared: Add DocumentProcessResult: This should be returned by processors 2023-05-14 14:07:15 +02:00
c5713f5839 shared: Introduce DocumentOutlineEntry 2023-05-14 14:06:24 +02:00
8550506517 migrations: Add 6.sql: Begin outline index 2023-05-14 14:05:26 +02:00
2b1dc72410 Release v0.9, Update docs 2023-05-07 20:18:55 +02:00
22fee1d064 shared: TokenType: FILTER_TAG_ASSIGNED is not a content search token 2023-05-07 17:11:31 +02:00
50a5c399c4 gui: PreviewGeneratorPlainText: Don't add header if we have nothing to show 2023-05-07 17:11:31 +02:00
4b3ebb08c2 cli: commandadd: Improve help message 2023-05-07 17:11:31 +02:00
4c5643e342 cli,shared: Add remove, show and list for tags 2023-05-07 17:11:31 +02:00
e8d217e191 cli: CommandAdd: Add verbose (-v) 2023-05-07 17:11:31 +02:00
37 files changed, with 530 insertions and 105 deletions


@ -1,5 +1,24 @@
# looqs: Release notes
## 2023-05-07 - v0.9
Highlights: Tag support. Also begin new index mode to only index metadata (currently only path + file size, more to come).
Note: Upgrading can take some time as new column indexes will be added
CHANGES:
- gui: Improve font rendering in previews
- gui: Allow indexing only metadata
- gui: Allow adding content for files which only had metadata indexed before
- gui: Allow assigning tags by right clicking on paths
- cli: "add" command: Implement --verbose (-v)
- cli: "add" command: Implement --no-content and --fill-content
- cli: Add "tag" command which allows managing tags for paths.
- search: Add "tag:()", "t:()" filters
- Minor improvements and refactorings under the hood
- Add packages: Ubuntu 23.04.
## 2022-11-19 - v0.8.1
CHANGES:


@ -28,7 +28,7 @@ There is no need to write the long form of filters. There are also booleans avai
The screenshots in this section may occasionally be slightly outdated, but they are usually recent enough to get an overall impression of the current state of the GUI.
## Current status
Latest version: 2022-11-19, v0.8.1
Latest version: 2023-05-07, v0.9
Please keep in mind: looqs is still at an early stage and may exhibit some weirdness and contain bugs.
@ -76,7 +76,7 @@ To build on Ubuntu and Debian, clone the repo and then run:
```
git submodule init
git submodule update
sudo apt install build-essential qtbase5-dev libpoppler-qt5-dev libuchardet-dev libquazip5-dev
sudo apt install build-essential qtbase5-dev libqt5sql5-sqlite libpoppler-qt5-dev libuchardet-dev libquazip5-dev
qmake
make
```
@ -97,7 +97,9 @@ The GUI is located in `gui/looqs-gui`, the binary for the CLI is in `cli/looqs`
## Packages
At this point, looqs is not in any official distro package repo, but I maintain some packages.
### Ubuntu 22.04, 22.10
### Ubuntu 23.04, 22.10, 22.04
Latest release can be installed using apt from the repo.
```
# First, obtain key, assume it's trusted.
@ -108,6 +110,8 @@ echo "deb [arch=amd64 signed-by=/usr/share/keyrings/repo.quitesimple.org.gpg] ht
sudo apt-get update
sudo apt-get install looqs
```
### Gentoo (EXPERIMENTAL)
Available in this overlay: https://github.com/quitesimpleorg/quitesimple-overlay
### Prebuilt tarball (distro-agnostic) (EXPERIMENTAL)
looqs is also distributed as a tarball containing prebuilt binaries and its library dependencies. The tarball is
@ -134,7 +138,7 @@ An AppImage may accompany the tarball in the future.
### Other distros
I'll probably add a package for voidlinux at some point and maybe will provide a Gentoo ebuild. However, I would appreciate help for others distros. If you create a package, let me know!
I appreciate help for other distros. If you create a package, let me know!
### Signature verification


@ -165,6 +165,8 @@ A number of search filters are available.
| path.begins:(term) | pb:(term) | Filters path beginning with the specified term |
| contains:(terms) | c:(terms) | Full-text search, also understands quotes |
| limit:(integer) | - | Limits the number of results. The default is 1000. Say "limit:0" to see all results |
| tag:(tagname) | t:(tagname) | Filter for files that have been tagged with the corresponding tag |
Filters can be combined. The booleans AND and OR are supported. Negations can be applied too, except for c:(). Negations are specified with "!".
The AND boolean is implicit and thus entering it strictly optional.
@ -177,11 +179,5 @@ Examples:
|p:(notes) (pe:(odt) OR pe:(docx))          |Finds files such as notes.docx, notes.odt but also any .docx and .odt when the path contains the string 'notes'|
|memcpy !(pe:(.c) OR pe:(.cpp))| Performs a FTS search for 'memcpy' but excludes .cpp and .c files.|
|c:("I think, therefore")|Performs a FTS search for the phrase "I think, therefore".|
|c:("invoice") Downloads|This query is equivalent to c:("invoice") p:("Downloads")|
|c:("invoice") Downloads|Equivalent to c:("invoice") p:("Downloads")|
|p:(Downloads) invoice|Equivalent to c:("invoice") p:("Downloads")|
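
Illustrative note, not part of the diff: the `name:(value)` filter shape described above can be recognised with a few string operations. The following is a minimal sketch using only the C++ standard library. `splitFilter` and `canonicalFilterName` are hypothetical helpers invented for this example; they are not looqs functions and do not reflect how `LooqsQuery::build` actually parses expressions. Only the aliases visible in the table above are mapped.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical helper (not part of looqs): splits one "name:(value)" token.
// For a lone word, the returned name is empty and the value is the word itself.
std::pair<std::string, std::string> splitFilter(const std::string &token)
{
    const std::string::size_type open = token.find(":(");
    if(open == std::string::npos || open == 0 || token[token.size() - 1] != ')')
    {
        return std::make_pair(std::string(), token);
    }
    // Skip ":(" after the name and drop the closing ")".
    return std::make_pair(token.substr(0, open), token.substr(open + 2, token.size() - open - 3));
}

// Hypothetical helper: maps the short aliases from the table to their long form.
std::string canonicalFilterName(const std::string &name)
{
    if(name == "t")
    {
        return "tag";
    }
    if(name == "c")
    {
        return "contains";
    }
    if(name == "pb")
    {
        return "path.begins";
    }
    return name;
}
```

With these helpers, `splitFilter("t:(work)")` yields the pair `("t", "work")`, and `canonicalFilterName("t")` yields `"tag"`. Quoting, negation with "!" and boolean operators are deliberately left out of the sketch.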


@ -46,6 +46,8 @@ packagesExist(quazip1-qt5) {
}
INCLUDEPATH += $$PWD/../shared
INCLUDEPATH += /usr/include/poppler/qt5/
DEPENDPATH += $$PWD/../shared
win32-g++:CONFIG(release, debug|release): PRE_TARGETDEPS += $$OUT_PWD/../shared/release/libshared.a


@ -41,13 +41,13 @@ int CommandAdd::handle(QStringList arguments)
{
QCommandLineParser parser;
parser.addOptions({{{"c", "continue"},
"Continue adding files, don't exit on first error. If this option is not given, looqs will "
"exit asap, but it's possible that a few files will still be processed. "
"Continue adding files, don't exit on first error. Exit code will be 0. If this option is not "
"given, looqs will "
"exit asap, but it's possible that a few files will still be processed. "
"Set -t 1 to avoid this behavior, but processing will be slower. "},
{{"n", "no-content"}, "Only add paths to database. Do not index content"},
{{"v", "verbose"}, "Print paths of files being processed"},
{{"f", "fill-content"}, "Index content for files previously indexed with -n"},
{"tags", "Comma-separated list of tags to assign"},
{{"t", "threads"}, "Number of threads to use.", "threads"}});
parser.addHelpOption();
parser.addPositionalArgument("add", "Add paths to the index",
@ -111,10 +111,9 @@ int CommandAdd::handle(QStringList arguments)
{
IndexResult indexResult = indexer->getResult();
int newlyAdded = indexResult.results.count() - currentResult.results.count();
int newOffset = 0;
if(newlyAdded > 0)
{
newOffset = indexResult.results.count() - newlyAdded;
int newOffset = indexResult.results.count() - newlyAdded;
for(int i = newOffset; i < indexResult.results.count(); i++)
{
auto result = indexResult.results.at(i);


@ -3,14 +3,39 @@
#include "logger.h"
#include "tagmanager.h"
bool CommandTag::ensureAbsolutePaths(const QVector<QString> &paths, QVector<QString> &absolutePaths)
{
for(const QString &path : paths)
{
QFileInfo info{path};
if(!info.exists())
{
Logger::error() << "Can't add tag for file " + info.absoluteFilePath() + " because it does not exist"
<< Qt::endl;
return false;
}
QString absolutePath = info.absoluteFilePath();
if(!this->dbService->fileExistsInDatabase(absolutePath))
{
Logger::error() << "Only files that have been indexed can be tagged. File not in index: " + absolutePath
<< Qt::endl;
return false;
}
absolutePaths.append(absolutePath);
}
return true;
}
int CommandTag::handle(QStringList arguments)
{
QCommandLineParser parser;
parser.addPositionalArgument("add", "Adds a tag to a file",
"add [tag] [paths...]. Adds the tag to the specified paths");
parser.addPositionalArgument("remove", "Removes a file associated to tag", "remove [tag] [file]");
parser.addPositionalArgument("remove", "Removes a path associated to a tag", "remove [tag] [path]");
parser.addPositionalArgument("delete", "Deletes a tag", "delete [tag]");
parser.addPositionalArgument("list", "Lists paths associated with a tag, or all tags", "list [tag]");
parser.addPositionalArgument("show", "Lists tags associated with a path", "show [path]");
parser.addHelpOption();
parser.parse(arguments);
@ -21,9 +46,8 @@ int CommandTag::handle(QStringList arguments)
parser.showHelp(EXIT_FAILURE);
return EXIT_FAILURE;
}
TagManager tagManager{*this->dbService};
QString cmd = args[0];
qDebug() << cmd;
if(cmd == "add")
{
if(args.length() < 3)
@ -33,28 +57,14 @@ int CommandTag::handle(QStringList arguments)
return EXIT_FAILURE;
}
QString tag = args[1];
auto paths = args.mid(2).toVector();
for(int i = 0; i < paths.size(); i++)
{
QFileInfo info{paths[i]};
if(!info.exists())
{
Logger::error() << "Can't add tag for file " + info.absoluteFilePath() + " because it does not exist"
<< Qt::endl;
return EXIT_FAILURE;
}
QString absolutePath = info.absoluteFilePath();
if(!this->dbService->fileExistsInDatabase(absolutePath))
{
Logger::error() << "Only files that have been indexed can be tagged. File not in index: " + absolutePath
<< Qt::endl;
return EXIT_FAILURE;
}
paths[i] = absolutePath;
}
QVector<QString> paths = args.mid(2).toVector();
TagManager tagManager{*this->dbService};
bool result = tagManager.addPathsToTag(tag, paths);
QVector<QString> absolutePaths;
if(!ensureAbsolutePaths(paths, absolutePaths))
{
return EXIT_FAILURE;
}
bool result = tagManager.addPathsToTag(tag, absolutePaths);
if(!result)
{
Logger::error() << "Failed to assign tags" << Qt::endl;
@ -62,6 +72,82 @@ int CommandTag::handle(QStringList arguments)
}
return EXIT_SUCCESS;
}
if(cmd == "list")
{
return 0;
QString tag;
if(args.length() >= 2)
{
tag = args[1];
}
QVector<QString> entries;
if(tag.isEmpty())
{
entries = tagManager.getTags();
}
else
{
entries = tagManager.getPaths(tag);
}
for(const QString &entry : entries)
{
Logger::info() << entry << Qt::endl;
}
}
if(cmd == "remove")
{
if(args.length() < 3)
{
Logger::error() << "Not enough arguments provided. 'remove' requires a tag followed by at least one path"
<< Qt::endl;
return EXIT_FAILURE;
}
QString tag = args[1];
QVector<QString> paths = args.mid(2).toVector();
QVector<QString> absolutePaths;
if(!ensureAbsolutePaths(paths, absolutePaths))
{
return EXIT_FAILURE;
}
if(!tagManager.removePathsForTag(tag, absolutePaths))
{
Logger::error() << "Failed to remove path assignments" << Qt::endl;
return EXIT_FAILURE;
}
}
if(cmd == "delete")
{
if(args.length() != 2)
{
Logger::error() << "The 'delete' command requires the tag to delete" << Qt::endl;
return EXIT_FAILURE;
}
if(!tagManager.deleteTag(args[1]))
{
Logger::error() << "Failed to delete tag" << Qt::endl;
return EXIT_FAILURE;
}
}
if(cmd == "show")
{
if(args.length() != 2)
{
Logger::error() << "The 'show' command requires a path to show the assigned tags" << Qt::endl;
return EXIT_FAILURE;
}
QString path = args[1];
QVector<QString> absolutePaths;
if(!ensureAbsolutePaths({path}, absolutePaths))
{
return EXIT_FAILURE;
}
QVector<QString> tags = tagManager.getTags(absolutePaths.at(0));
for(const QString &entry : tags)
{
Logger::info() << entry << Qt::endl;
}
}
return EXIT_SUCCESS;
}


@ -4,6 +4,9 @@
class CommandTag : public Command
{
protected:
bool ensureAbsolutePaths(const QVector<QString> &paths, QVector<QString> &absolutePaths);
public:
using Command::Command;


@ -24,7 +24,7 @@ QSharedPointer<PreviewResult> PreviewGeneratorOdt::generate(RenderConfig config,
throw LooqsGeneralException("Error while reading content.xml of " + documentPath);
}
TagStripperProcessor tsp;
QString content = tsp.process(entireContent).constFirst().content;
QString content = tsp.process(entireContent).pages.constFirst().content;
PreviewGeneratorPlainText plainTextGenerator;
result->setText(plainTextGenerator.generatePreviewText(content, config, info.fileName()));


@ -246,17 +246,21 @@ QString PreviewGeneratorPlainText::generateLineBasedPreviewText(QTextStream &in,
totalWordCountMap[it->first] = totalWordCountMap.value(it->first, 0) + it->second;
}
}
if(isTruncated)
if(!resultText.isEmpty())
{
header += "(truncated) ";
}
for(QString &word : config.wordsToHighlight)
{
header += word + ": " + QString::number(totalWordCountMap[word]) + " ";
}
header += "<hr>";
if(isTruncated)
{
header += "(truncated) ";
}
for(QString &word : config.wordsToHighlight)
{
header += word + ": " + QString::number(totalWordCountMap[word]) + " ";
}
header += "<hr>";
return header + resultText;
resultText = header + resultText;
}
return resultText;
}
QSharedPointer<PreviewResult> PreviewGeneratorPlainText::generate(RenderConfig config, QString documentPath,


@ -24,7 +24,9 @@ QString DefaultTextProcessor::processText(const QByteArray &data) const
return {};
}
QVector<PageData> DefaultTextProcessor::process(const QByteArray &data) const
DocumentProcessResult DefaultTextProcessor::process(const QByteArray &data) const
{
return {{0, processText(data)}};
DocumentProcessResult result;
result.pages.append({0, processText(data)});
return result;
}


@ -11,7 +11,7 @@ class DefaultTextProcessor : public Processor
public:
DefaultTextProcessor();
QString processText(const QByteArray &data) const;
QVector<PageData> process(const QByteArray &data) const override;
DocumentProcessResult process(const QByteArray &data) const override;
};
#endif // DEFAULTTEXTPROCESSOR_H


@ -0,0 +1,31 @@
#include "documentoutlineentry.h"
DocumentOutlineEntry::DocumentOutlineEntry()
{
}
QDataStream &operator<<(QDataStream &out, const DocumentOutlineEntry &pd)
{
out << pd.text << pd.type << pd.destinationPage;
out << pd.children.size();
for(const DocumentOutlineEntry &entry : pd.children)
{
out << entry;
}
return out;
}
QDataStream &operator>>(QDataStream &in, DocumentOutlineEntry &pd)
{
in >> pd.text >> pd.type >> pd.destinationPage;
int numChildren;
in >> numChildren;
for(int i = 0; i < numChildren; i++)
{
DocumentOutlineEntry entry;
in >> entry;
pd.children.append(entry);
}
return in;
}
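
Illustrative note, not part of the diff: the two stream operators above write an entry's own fields, then a child count, then each child recursively, and read them back in the same order. The following is a minimal sketch of the same length-prefixed recursive scheme using standard C++ streams instead of `QDataStream`. `OutlineNode`, `writeNode`, `readNode` and `roundTrip` are names invented for this example, and the text-based wire format is an assumption of the sketch, not the format Qt uses.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Simplified stand-in for DocumentOutlineEntry (no Qt types, no destination type).
struct OutlineNode
{
    std::string text;
    unsigned int page = 0;
    std::vector<OutlineNode> children;
};

// Writes the text length, the text, the page and the child count,
// then recurses into every child.
void writeNode(std::ostream &out, const OutlineNode &node)
{
    out << node.text.size() << ' ' << node.text << ' ' << node.page << ' ' << node.children.size() << ' ';
    for(const OutlineNode &child : node.children)
    {
        writeNode(out, child);
    }
}

// Reads the fields in exactly the order writeNode() emitted them.
OutlineNode readNode(std::istream &in)
{
    OutlineNode node;
    std::size_t textLength = 0;
    std::size_t childCount = 0;
    in >> textLength;
    in.get(); // consume the single separator before the raw text
    node.text.resize(textLength);
    in.read(&node.text[0], static_cast<std::streamsize>(textLength));
    in >> node.page >> childCount;
    for(std::size_t i = 0; i < childCount; i++)
    {
        node.children.push_back(readNode(in));
    }
    return node;
}

// Serializes a node into an in-memory buffer and reads it back.
OutlineNode roundTrip(const OutlineNode &node)
{
    std::stringstream buffer;
    writeNode(buffer, node);
    return readNode(buffer);
}
```

Because the text is length-prefixed, titles containing spaces such as "Chapter 1" survive the round trip. The sketch does no error handling on truncated input, which a real reader would need.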


@ -0,0 +1,29 @@
#ifndef DOCUMENTOUTLINEENTRY_H
#define DOCUMENTOUTLINEENTRY_H
#include <QMetaType>
#include <QDataStream>
#include <QString>
enum OutlineDestinationType
{
OUTLINE_DESTINATION_TYPE_NONE,
OUTLINE_DESTINATION_TYPE_PAGE
/* In the future, links, or #anchors are possible */
};
class DocumentOutlineEntry
{
public:
DocumentOutlineEntry();
QVector<DocumentOutlineEntry> children;
OutlineDestinationType type = OUTLINE_DESTINATION_TYPE_NONE;
QString text;
unsigned int destinationPage = 0;
};
Q_DECLARE_METATYPE(DocumentOutlineEntry);
QDataStream &operator<<(QDataStream &out, const DocumentOutlineEntry &pd);
QDataStream &operator>>(QDataStream &in, DocumentOutlineEntry &pd);
#endif // DOCUMENTOUTLINEENTRY_H


@ -0,0 +1,39 @@
#include "documentprocessresult.h"
QDataStream &operator<<(QDataStream &out, const DocumentProcessResult &pd)
{
out << pd.pages.size();
out << pd.outlines.size();
for(const PageData &page : pd.pages)
{
out << page;
}
for(const DocumentOutlineEntry &outline : pd.outlines)
{
out << outline;
}
return out;
}
QDataStream &operator>>(QDataStream &in, DocumentProcessResult &pd)
{
int numPages, numOutlines;
in >> numPages;
in >> numOutlines;
for(int i = 0; i < numPages; i++)
{
PageData data;
in >> data;
pd.pages.append(data);
}
for(int i = 0; i < numOutlines; i++)
{
DocumentOutlineEntry outline;
in >> outline;
pd.outlines.append(outline);
}
return in;
}


@ -0,0 +1,17 @@
#ifndef DOCUMENTPROCESSRESULT_H
#define DOCUMENTPROCESSRESULT_H
#include <pagedata.h>
#include <documentoutlineentry.h>
class DocumentProcessResult
{
public:
QVector<PageData> pages;
QVector<DocumentOutlineEntry> outlines;
};
Q_DECLARE_METATYPE(DocumentProcessResult);
QDataStream &operator<<(QDataStream &out, const DocumentProcessResult &pd);
QDataStream &operator>>(QDataStream &in, DocumentProcessResult &pd);
#endif // DOCUMENTPROCESSRESULT_H


@ -110,7 +110,7 @@ int FileSaver::processFiles(const QVector<QString> paths, std::function<SaveFile
SaveFileResult FileSaver::saveFile(const QFileInfo &fileInfo)
{
QVector<PageData> pageData;
DocumentProcessResult processResult;
QString canonicalPath = fileInfo.canonicalFilePath();
int processorReturnCode = -1;
@ -169,11 +169,10 @@ SaveFileResult FileSaver::saveFile(const QFileInfo &fileInfo)
* finishes.
*/
QDataStream in(process.readAllStandardOutput());
while(!in.atEnd())
if(!in.atEnd())
{
PageData pd;
in >> pd;
pageData.append(pd);
in >> processResult;
}
processorReturnCode = process.exitCode();
if(processorReturnCode != OK && processorReturnCode != OK_WASEMPTY)
@ -185,7 +184,7 @@ SaveFileResult FileSaver::saveFile(const QFileInfo &fileInfo)
}
}
}
SaveFileResult result = this->dbService->saveFile(fileInfo, pageData, this->fileSaverOptions.metadataOnly);
SaveFileResult result = this->dbService->saveFile(fileInfo, processResult, this->fileSaverOptions.metadataOnly);
if(result == OK && processorReturnCode == OK_WASEMPTY)
{
return OK_WASEMPTY;


@ -29,6 +29,11 @@ bool LooqsQuery::hasContentSearch() const
return (this->getTokensMask() & FILTER_CONTENT) == FILTER_CONTENT;
}
bool LooqsQuery::hasOutlineSearch() const
{
return (this->getTokensMask() & FILTER_OUTLINE_CONTAINS) == FILTER_OUTLINE_CONTAINS;
}
bool LooqsQuery::hasPathSearch() const
{
return (this->getTokensMask() & FILTER_PATH) == FILTER_PATH;
@ -289,6 +294,10 @@ LooqsQuery LooqsQuery::build(QString expression, TokenType loneWordsTokenType, b
{
tokenType = FILTER_TAG_ASSIGNED;
}
else if(filtername == "toc" || filtername == "outline")
{
tokenType = FILTER_OUTLINE_CONTAINS;
}
// TODO: given this is not really a "filter", this feels slightly misplaced here
else if(filtername == "sort")
{


@ -68,6 +68,7 @@ class LooqsQuery
this->limit = limit;
}
bool hasContentSearch() const;
bool hasOutlineSearch() const;
bool hasPathSearch() const;
void addSortCondition(SortCondition sc);

shared/migrations/6.sql Normal file

@ -0,0 +1,2 @@
CREATE TABLE outline(id INTEGER PRIMARY KEY, fileid INTEGER REFERENCES file (id) ON DELETE CASCADE, text varchar(1024), page integer);
CREATE INDEX outline_fileid ON outline (fileid);


@ -5,5 +5,6 @@
<file>3.sql</file>
<file>4.sql</file>
<file>5.sql</file>
<file>6.sql</file>
</qresource>
</RCC>


@ -10,7 +10,7 @@ class NothingProcessor : public Processor
NothingProcessor();
public:
QVector<PageData> process(const QByteArray & /*data*/) const override
DocumentProcessResult process(const QByteArray & /*data*/) const override
{
return {};
}


@ -3,12 +3,12 @@
#include "odtprocessor.h"
#include "tagstripperprocessor.h"
QVector<PageData> OdtProcessor::process(const QByteArray & /*data*/) const
DocumentProcessResult OdtProcessor::process(const QByteArray & /*data*/) const
{
throw LooqsGeneralException("Not implemented yet");
}
QVector<PageData> OdtProcessor::process(QString path) const
DocumentProcessResult OdtProcessor::process(QString path) const
{
QuaZipFile zipFile(path);
zipFile.setFileName("content.xml");


@ -8,9 +8,9 @@ class OdtProcessor : public Processor
{
this->PREFERED_DATA_SOURCE = FILEPATH;
}
QVector<PageData> process(const QByteArray &data) const override;
DocumentProcessResult process(const QByteArray &data) const override;
QVector<PageData> process(QString path) const override;
DocumentProcessResult process(QString path) const override;
};
#endif // ODTPROCESSOR_H


@ -5,9 +5,30 @@ PdfProcessor::PdfProcessor()
{
}
QVector<PageData> PdfProcessor::process(const QByteArray &data) const
QVector<DocumentOutlineEntry> PdfProcessor::createOutline(const QVector<Poppler::OutlineItem> &outlineItems) const
{
QVector<PageData> result;
QVector<DocumentOutlineEntry> result;
for(const Poppler::OutlineItem &outlineItem : outlineItems)
{
DocumentOutlineEntry documentOutlineEntry;
documentOutlineEntry.text = outlineItem.name();
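/* Only claim a page destination when the outline item actually has one */
documentOutlineEntry.type = OUTLINE_DESTINATION_TYPE_NONE;
documentOutlineEntry.destinationPage = 0;
if(!outlineItem.destination().isNull())
{
documentOutlineEntry.type = OUTLINE_DESTINATION_TYPE_PAGE;
documentOutlineEntry.destinationPage = outlineItem.destination()->pageNumber();
}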
if(outlineItem.hasChildren())
{
documentOutlineEntry.children = createOutline(outlineItem.children());
}
result.append(documentOutlineEntry);
}
return result;
}
DocumentProcessResult PdfProcessor::process(const QByteArray &data) const
{
DocumentProcessResult result;
QScopedPointer<Poppler::Document> doc(Poppler::Document::loadFromData(data));
if(doc.isNull())
{
@ -26,12 +47,13 @@ QVector<PageData> PdfProcessor::process(const QByteArray &data) const
for(auto i = 0; i < pagecount; i++)
{
QString text = doc->page(i)->text(entirePage);
result.append({static_cast<unsigned int>(i + 1), text});
result.pages.append({static_cast<unsigned int>(i + 1), text});
/*TODO: hack, so we can fts search several words over the whole document, not just pages.
* this of course uses more space and should be solved differently.
*/
entire += text;
}
result.append({0, entire});
result.pages.append({0, entire});
result.outlines = createOutline(doc->outline());
return result;
}


@ -1,5 +1,6 @@
#ifndef PDFPROCESSOR_H
#define PDFPROCESSOR_H
#include <poppler-qt5.h>
#include "processor.h"
class PdfProcessor : public Processor
{
@ -7,7 +8,8 @@ class PdfProcessor : public Processor
PdfProcessor();
public:
QVector<PageData> process(const QByteArray &data) const override;
QVector<DocumentOutlineEntry> createOutline(const QVector<Poppler::OutlineItem> &outlineItems) const;
DocumentProcessResult process(const QByteArray &data) const override;
};
#endif // PDFPROCESSOR_H


@ -2,8 +2,8 @@
#define PROCESSOR_H
#include <QVector>
#include <QFile>
#include "pagedata.h"
#include "utils.h"
#include "documentprocessresult.h"
enum DataSource
{
FILEPATH,
@ -18,8 +18,8 @@ class Processor
* a single file */
DataSource PREFERED_DATA_SOURCE = ARRAY;
Processor();
virtual QVector<PageData> process(const QByteArray &data) const = 0;
virtual QVector<PageData> process(QString path) const
virtual DocumentProcessResult process(const QByteArray &data) const = 0;
virtual DocumentProcessResult process(QString path) const
{
return process(Utils::readFile(path));
}


@ -65,18 +65,12 @@ void SandboxedProcessor::enableSandbox(QString readablePath)
exile_free_policy(policy);
}
void SandboxedProcessor::printResults(const QVector<PageData> &pageData)
void SandboxedProcessor::printResults(const DocumentProcessResult &result)
{
QFile fsstdout;
fsstdout.open(stdout, QIODevice::WriteOnly);
QDataStream stream(&fsstdout);
for(const PageData &data : pageData)
{
stream << data;
// fsstdout.flush();
}
stream << result;
fsstdout.close();
}
@ -102,7 +96,7 @@ SaveFileResult SandboxedProcessor::process()
return OK;
}
QVector<PageData> pageData;
DocumentProcessResult processResult;
QString absPath = fileInfo.absoluteFilePath();
try
@ -111,13 +105,13 @@ SaveFileResult SandboxedProcessor::process()
{
/* Read access to FS needed... doh..*/
enableSandbox(absPath);
pageData = processor->process(absPath);
processResult = processor->process(absPath);
}
else
{
QByteArray data = Utils::readFile(absPath);
enableSandbox();
pageData = processor->process(data);
processResult = processor->process(data);
}
}
catch(LooqsGeneralException &e)
@ -126,6 +120,6 @@ SaveFileResult SandboxedProcessor::process()
return PROCESSFAIL;
}
printResults(pageData);
return pageData.isEmpty() ? OK_WASEMPTY : OK;
printResults(processResult);
return processResult.pages.isEmpty() ? OK_WASEMPTY : OK;
}


@ -2,7 +2,7 @@
#define SANDBOXEDPROCESSOR_H
#include <QString>
#include <QMimeDatabase>
#include "pagedata.h"
#include "documentprocessresult.h"
#include "savefileresult.h"
class SandboxedProcessor
@ -12,7 +12,7 @@ class SandboxedProcessor
QMimeDatabase mimeDatabase;
void enableSandbox(QString readablePath = "");
void printResults(const QVector<PageData> &pageData);
void printResults(const DocumentProcessResult &result);
public:
SandboxedProcessor(QString filepath)


@ -42,6 +42,8 @@ SOURCES += sqlitesearch.cpp \
dbmigrator.cpp \
defaulttextprocessor.cpp \
dirscanworker.cpp \
documentoutlineentry.cpp \
documentprocessresult.cpp \
encodingdetector.cpp \
filesaver.cpp \
filescanworker.cpp \
@ -72,6 +74,8 @@ HEADERS += sqlitesearch.h \
dbmigrator.h \
defaulttextprocessor.h \
dirscanworker.h \
documentoutlineentry.h \
documentprocessresult.h \
encodingdetector.h \
filedata.h \
filesaver.h \


@ -142,6 +142,27 @@ QVector<QString> SqliteDbService::getTagsForPath(QString path)
return result;
}
QVector<QString> SqliteDbService::getPathsForTag(QString tag)
{
QVector<QString> result;
auto query = QSqlQuery(dbFactory->forCurrentThread());
query.prepare(
"SELECT file.path FROM tag INNER JOIN filetag ON tag.id = filetag.tagid INNER JOIN file ON filetag.fileid "
"= file.id WHERE tag.name = ?");
query.addBindValue(tag.toLower());
query.setForwardOnly(true);
if(!query.exec())
{
throw LooqsGeneralException("Error while trying to retrieve paths from database: " + query.lastError().text());
}
while(query.next())
{
QString path = query.value(0).toString();
result.append(path);
}
return result;
}
bool SqliteDbService::setTags(QString path, const QSet<QString> &tags)
{
QSqlDatabase db = dbFactory->forCurrentThread();
@ -232,6 +253,29 @@ bool SqliteDbService::insertToFTS(bool useTrigrams, QSqlDatabase &db, int fileid
return true;
}
bool SqliteDbService::insertOutline(QSqlDatabase &db, int fileid, const QVector<DocumentOutlineEntry> &outlines)
{
QSqlQuery outlineQuery(db);
outlineQuery.prepare("INSERT INTO outline(fileid, text, page) VALUES(?,?,?)");
outlineQuery.addBindValue(fileid);
for(const DocumentOutlineEntry &outline : outlines)
{
outlineQuery.bindValue(1, outline.text.toLower());
outlineQuery.bindValue(2, outline.destinationPage);
if(!outlineQuery.exec())
{
Logger::error() << "Failed outline insertion " << outlineQuery.lastError() << Qt::endl;
return false;
}
if(!insertOutline(db, fileid, outline.children))
{
Logger::error() << "Failed outline insertion (children) " << outlineQuery.lastError() << Qt::endl;
return false;
}
}
return true;
}
QSqlQuery SqliteDbService::exec(QString querystr, std::initializer_list<QVariant> args)
{
auto query = QSqlQuery(dbFactory->forCurrentThread());
@ -257,7 +301,7 @@ bool SqliteDbService::execBool(QString querystr, std::initializer_list<QVariant>
return query.value(0).toBool();
}
SaveFileResult SqliteDbService::saveFile(QFileInfo fileInfo, QVector<PageData> &pageData, bool pathsOnly)
SaveFileResult SqliteDbService::saveFile(QFileInfo fileInfo, DocumentProcessResult &processResult, bool pathsOnly)
{
QString absPath = fileInfo.absoluteFilePath();
auto mtime = fileInfo.lastModified().toSecsSinceEpoch();
@ -302,18 +346,24 @@ SaveFileResult SqliteDbService::saveFile(QFileInfo fileInfo, QVector<PageData> &
if(!pathsOnly)
{
int lastid = inserterQuery.lastInsertId().toInt();
if(!insertToFTS(false, db, lastid, pageData))
if(!insertToFTS(false, db, lastid, processResult.pages))
{
db.rollback();
Logger::error() << "Failed to insert data to FTS index " << Qt::endl;
return DBFAIL;
}
if(!insertToFTS(true, db, lastid, pageData))
if(!insertToFTS(true, db, lastid, processResult.pages))
{
db.rollback();
Logger::error() << "Failed to insert data to FTS index " << Qt::endl;
return DBFAIL;
}
if(!insertOutline(db, lastid, processResult.outlines))
{
db.rollback();
Logger::error() << "Failed to insert outline data " << Qt::endl;
return DBFAIL;
}
}
if(!db.commit())
@ -379,3 +429,68 @@ bool SqliteDbService::addTag(QString tag, const QVector<QString> &paths)
return true;
}
bool SqliteDbService::removePathsForTag(QString tag, const QVector<QString> &paths)
{
QSqlDatabase db = dbFactory->forCurrentThread();
QSqlQuery tagQuery(db);
QSqlQuery fileTagQuery(db);
tag = tag.toLower();
fileTagQuery.prepare(
"DELETE FROM filetag WHERE fileid = (SELECT id FROM file WHERE path = ?) AND tagid = (SELECT id "
"FROM tag WHERE name = ?)");
fileTagQuery.bindValue(1, tag);
for(const QString &path : paths)
{
fileTagQuery.bindValue(0, path);
if(!fileTagQuery.exec())
{
Logger::error() << "An error occurred while trying to remove paths from tag assignment" << Qt::endl;
return false;
}
}
return true;
}
bool SqliteDbService::deleteTag(QString tag)
{
QSqlDatabase db = dbFactory->forCurrentThread();
if(!db.transaction())
{
Logger::error() << "Failed to open transaction while trying to delete tag " << tag << " : " << db.lastError()
<< Qt::endl;
return false;
}
tag = tag.toLower();
QSqlQuery assignmentDeleteQuery(db);
assignmentDeleteQuery.prepare("DELETE FROM filetag WHERE tagid = (SELECT id FROM tag WHERE name = ?)");
assignmentDeleteQuery.addBindValue(tag);
if(!assignmentDeleteQuery.exec())
{
db.rollback();
Logger::error() << "Error while trying to delete tag: " << db.lastError() << Qt::endl;
return false;
}
QSqlQuery deleteTagQuery(db);
deleteTagQuery.prepare("DELETE FROM tag WHERE name = ?");
deleteTagQuery.addBindValue(tag);
if(!deleteTagQuery.exec())
{
db.rollback();
Logger::error() << "Error while trying to delete tag: " << db.lastError() << Qt::endl;
return false;
}
if(!db.commit())
{
db.rollback();
Logger::error() << "Error while trying to delete tag: " << db.lastError() << Qt::endl;
return false;
}
return true;
}
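
Illustrative note, not part of the diff: `insertOutline()` above walks the outline tree depth-first and stores one row per entry (file id, lowercased text, page), so the `outline` table created in 6.sql holds a flat list without parent links. The following is a minimal sketch of that flattening step using only the C++ standard library. `TocEntry`, `OutlineRow`, `toLowerAscii` and `flattenOutline` are names invented for this example. Unlike `QString::toLower()`, the lowercase helper here only handles ASCII.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// Simplified stand-in for DocumentOutlineEntry.
struct TocEntry
{
    std::string text;
    unsigned int page = 0;
    std::vector<TocEntry> children;
};

// One row as it would be bound to INSERT INTO outline(fileid, text, page).
struct OutlineRow
{
    int fileid;
    std::string text;
    unsigned int page;
};

// ASCII-only lowercase; an assumption of this sketch.
std::string toLowerAscii(std::string s)
{
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

// Appends one row per entry in pre-order: each parent first, then its children.
void flattenOutline(int fileid, const std::vector<TocEntry> &entries, std::vector<OutlineRow> &rows)
{
    for(const TocEntry &entry : entries)
    {
        rows.push_back({fileid, toLowerAscii(entry.text), entry.page});
        flattenOutline(fileid, entry.children, rows);
    }
}
```

Storing the text lowercased matches the search side, which also lowercases the search term before it is bound to the `LIKE` pattern.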


@ -5,7 +5,7 @@
#include "databasefactory.h"
#include "utils.h"
#include "pagedata.h"
#include "documentprocessresult.h"
#include "filedata.h"
#include "../shared/sqlitesearch.h"
#include "../shared/token.h"
@ -22,7 +22,7 @@ class SqliteDbService
public:
SqliteDbService(DatabaseFactory &dbFactory);
SaveFileResult saveFile(QFileInfo fileInfo, QVector<PageData> &pageData, bool pathsOnly);
SaveFileResult saveFile(QFileInfo fileInfo, DocumentProcessResult &processResult, bool pathsOnly);
bool deleteFile(QString path);
bool fileExistsInDatabase(QString path);
@ -34,11 +34,15 @@ class SqliteDbService
bool addTag(QString tag, const QVector<QString> &paths);
QVector<QString> getTags();
QVector<QString> getTagsForPath(QString path);
QVector<QString> getPathsForTag(QString tag);
bool setTags(QString path, const QSet<QString> &tags);
bool removePathsForTag(QString tag, const QVector<QString> &paths);
bool deleteTag(QString tag);
QVector<SearchResult> search(const LooqsQuery &query);
std::optional<QChar> queryFileType(QString absPath);
bool insertOutline(QSqlDatabase &db, int fileid, const QVector<DocumentOutlineEntry> &outlines);
};
#endif // SQLITEDBSERVICE_H


@ -148,6 +148,11 @@ QPair<QString, QVector<QString>> SqliteSearch::createSql(const Token &token)
return {" file.id IN (SELECT fileid FROM filetag WHERE tagid = (SELECT id FROM tag WHERE name = ?)) ",
{value.toLower()}};
}
if(token.type == FILTER_OUTLINE_CONTAINS)
{
return {" outline.text LIKE '%' || ? || '%' ", {value.toLower()}};
}
throw LooqsGeneralException("Unknown token passed (should not happen)");
}
@ -156,6 +161,7 @@ QSqlQuery SqliteSearch::makeSqlQuery(const LooqsQuery &query)
QString whereSql;
QVector<QString> bindValues;
bool isContentSearch = (query.getTokensMask() & FILTER_CONTENT) == FILTER_CONTENT;
bool isOutlineSearch = query.hasOutlineSearch();
if(query.getTokens().isEmpty())
{
throw LooqsGeneralException("Nothing to search for supplied");
@ -200,15 +206,22 @@ QSqlQuery SqliteSearch::makeSqlQuery(const LooqsQuery &query)
}
else
{
QString pageColumn = "'0' as page";
QString joiners = "";
if(isOutlineSearch)
{
pageColumn = "outline.page as page";
joiners = " INNER JOIN outline ON outline.fileid = file.id ";
}
if(sortSql.isEmpty())
{
sortSql = "ORDER BY file.mtime DESC";
}
prepSql = "SELECT file.path AS path, '0' as page, file.mtime AS mtime, file.size AS size, file.filetype AS "
"filetype FROM file WHERE 1=1 AND " +
whereSql + " " + sortSql;
prepSql = "SELECT DISTINCT file.path AS path, " + pageColumn +
",file.mtime AS mtime, file.size AS size, "
"file.filetype AS filetype FROM file" +
joiners + " WHERE 1=1 AND " + whereSql + " " + sortSql;
}
if(query.getLimit() > 0)
{
prepSql += " LIMIT " + QString::number(query.getLimit());
@ -242,7 +255,7 @@ QVector<SearchResult> SqliteSearch::search(const LooqsQuery &query)
throw LooqsGeneralException("SQL Error: " + dbQuery.lastError().text());
}
bool contentSearch = query.hasContentSearch();
bool contentSearch = query.hasContentSearch() || query.hasOutlineSearch();
while(dbQuery.next())
{
SearchResult result;
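
Illustrative note, not part of the diff: in the non-content branch of `makeSqlQuery()` above, an outline search swaps the constant page column for `outline.page` and adds an inner join on the `outline` table. The following is a minimal sketch of that string assembly using `std::string` instead of `QString`. `buildFileQuery` is a name invented for this example, and the WHERE clause is passed in as an already-built string, which simplifies what the real function does.

```cpp
#include <cassert>
#include <string>

// Hypothetical, simplified version of the non-content branch of makeSqlQuery().
std::string buildFileQuery(bool isOutlineSearch, const std::string &whereSql, const std::string &sortSql)
{
    std::string pageColumn = "'0' as page";
    std::string joiners;
    if(isOutlineSearch)
    {
        // Outline hits carry the page their outline entry points to.
        pageColumn = "outline.page as page";
        joiners = " INNER JOIN outline ON outline.fileid = file.id ";
    }
    // Fall back to newest-first ordering when no sort condition was given.
    const std::string order = sortSql.empty() ? "ORDER BY file.mtime DESC" : sortSql;
    return "SELECT DISTINCT file.path AS path, " + pageColumn +
           ", file.mtime AS mtime, file.size AS size, file.filetype AS filetype FROM file" + joiners +
           " WHERE 1=1 AND " + whereSql + " " + order;
}
```

The join can produce one row per matching outline entry, so a file may appear once per distinct page. `DISTINCT` only collapses rows whose path and page are both identical.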


@ -28,6 +28,31 @@ bool TagManager::removeTagsForPath(QString path, const QSet<QString> &tags)
return this->dbService->setTags(path, newTags);
}
bool TagManager::removePathsForTag(QString tag, const QVector<QString> &paths)
{
return this->dbService->removePathsForTag(tag, paths);
}
bool TagManager::deleteTag(QString tag)
{
return this->dbService->deleteTag(tag);
}
QVector<QString> TagManager::getTags(QString path)
{
return this->dbService->getTagsForPath(path);
}
QVector<QString> TagManager::getTags()
{
return this->dbService->getTags();
}
QVector<QString> TagManager::getPaths(QString tag)
{
return this->dbService->getPathsForTag(tag);
}
bool TagManager::addTagsToPath(QString path, QString tagstring, QChar delim)
{
auto splitted = tagstring.split(delim);


@ -17,9 +17,11 @@ class TagManager
bool addPathsToTag(QString tag, const QVector<QString> &paths);
bool removeTagsForPath(QString path, const QSet<QString> &tags);
bool removePathsForTag(QString tag, const QVector<QString> &paths);
bool deleteTag(QString tag);
QVector<QString> getTags(QString path);
QVector<QString> getTags();
QVector<QString> getPaths(QString tag);
};


@ -4,11 +4,11 @@ TagStripperProcessor::TagStripperProcessor()
{
}
QVector<PageData> TagStripperProcessor::process(const QByteArray &data) const
DocumentProcessResult TagStripperProcessor::process(const QByteArray &data) const
{
auto result = DefaultTextProcessor::process(data);
// TODO: does not work properly with <br> and does not deal with entities...
result[0].content.remove(QRegExp("<[^>]*>"));
Q_ASSERT(result.pages.size() > 0);
result.pages[0].content.remove(QRegExp("<[^>]*>"));
return result;
}


@ -8,7 +8,7 @@ class TagStripperProcessor : public DefaultTextProcessor
TagStripperProcessor();
public:
QVector<PageData> process(const QByteArray &data) const override;
DocumentProcessResult process(const QByteArray &data) const override;
};
#endif // XMLSTRIPPERPROCESSOR_H


@ -19,10 +19,11 @@ enum TokenType
FILTER_PATH_SIZE,
FILTER_PATH_ENDS,
FILTER_PATH_STARTS,
FILTER_CONTENT = 512,
FILTER_TAG_ASSIGNED,
FILTER_OUTLINE_CONTAINS,
FILTER_CONTENT = 512, /* Everything below here is content search (except LIMIT) */
FILTER_CONTENT_CONTAINS,
FILTER_CONTENT_PAGE,
FILTER_TAG_ASSIGNED,
LIMIT = 1024
};
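
Illustrative note, not part of the diff: `hasContentSearch()` tests `(mask & FILTER_CONTENT) == FILTER_CONTENT`, so any enumerator that implicitly follows `FILTER_CONTENT = 512` carries the 512 bit and counts as a content search. That is why the position of `FILTER_TAG_ASSIGNED` in the enum matters. The following is a minimal sketch of the effect. The `SK_` enumerators and every value below 512 are hypothetical, since the full `TokenType` enum is not visible in this hunk.

```cpp
#include <cassert>

// Hypothetical miniature of TokenType; values below 512 are invented.
enum SketchToken : unsigned int
{
    SK_PATH_STARTS = 6,
    SK_TAG_ASSIGNED = 7,
    SK_OUTLINE_CONTAINS = 8,
    SK_CONTENT = 512,
    SK_CONTENT_CONTAINS, // 513, carries the 512 bit
    SK_CONTENT_PAGE,     // 514, carries the 512 bit
    SK_LIMIT = 1024
};

// Same test as LooqsQuery::hasContentSearch(), applied to a raw mask.
bool hasContentSearch(unsigned int mask)
{
    return (mask & SK_CONTENT) == SK_CONTENT;
}
```

In this sketch, a tag token placed after `SK_CONTENT_PAGE` would implicitly get the value 515, and `hasContentSearch(515u)` would return true even though no content filter is present. Placing it before `SK_CONTENT` avoids that.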