Market Overview

Look Under the Hood of Global File Search Solutions

Experts at Cloudtenna Offer Tips for Success in Integrating GFS
Software

With enterprises and OEMs looking to integrate search tools into their
own products, Enterprise File Search (EFS) has emerged as one of today's
hottest technology trends – with several recent high-profile IPOs and
multimillion-dollar venture capital investments. However, a new
iteration called Global File Search (GFS) may quickly surpass it,
according to experts at software startup Cloudtenna.

GFS solutions are designed to search on-premises repositories (such as
network servers and storage), email apps, cloud file services, and
popular hosted collaboration suites that store documents. But because
search technologies and architectures vary greatly, older search
solutions lack connectors to many of these user file repositories.
Enterprises and vendors evaluating GFS solutions should look closely at
three important criteria: speed and accuracy, security, and scalability.

Speed-Accuracy Tradeoff

How fast are search results returned after submitting a query? Seconds?
Minutes? A good experience requires results in under a second. And how
quickly are file permission updates reflected in search results? It's
critical that users only see search results for files they have
permission to view.

All GFS software scans and indexes connected repositories to create its
own file reference database. Access permissions, typically stored as
access control lists (ACLs), are then applied to the index to ensure each
user only sees results they have permission to view. The speed at which this takes
place depends on the GFS software's underlying technology. The two
common approaches have been "query-time binding," which enforces
permissions at the time a user performs their search, and "early
binding," which pre-processes file permissions according to a set
schedule. Cloudtenna is introducing a new approach called "real-time
binding," which builds its index and then performs consistency checks so
any deltas are captured at the time a security change is made.

These three fundamentally different approaches deliver varying speeds
and quality of results. Query-time binding ensures that security
permissions are enforced in real time, but it suffers from high latency
that significantly slows search results. Early binding trades off
security for a fast search experience, but it will frequently return
results that don't reflect up-to-the-minute file permissions,
compromising file security. Real-time binding achieves speeds as fast as
early binding but maintains accuracy because it updates permissions
continually in the background.
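
The difference between the three approaches can be shown with a toy model. The sketch below is illustrative only and is not Cloudtenna's or any vendor's implementation: the names `Repository`, `PermissionIndex`, `search_query_time`, and `revoke_and_compare` are hypothetical, ACLs are modeled as plain sets of user names, and a counter of live permission checks stands in for query latency.

```python
from dataclasses import dataclass, field


@dataclass
class Repository:
    """Source of truth: file name -> set of users allowed to view it."""
    acls: dict = field(default_factory=dict)
    live_checks: int = 0  # counts permission lookups made at query time

    def can_view(self, user, name):
        self.live_checks += 1
        return user in self.acls.get(name, set())


class PermissionIndex:
    """Index-side copy of the ACLs, used to filter results without live lookups."""

    def __init__(self, repo):
        self.repo = repo
        self.acls = {}
        self.rebuild()

    def rebuild(self):
        """Full refresh on a schedule: the only update early binding gets."""
        self.acls = {name: set(users) for name, users in self.repo.acls.items()}

    def apply_change(self, name, users):
        """Incremental update for one file: the real-time binding idea."""
        self.acls[name] = set(users)

    def search(self, user, term):
        return sorted(n for n, users in self.acls.items() if term in n and user in users)


def search_query_time(repo, user, term):
    """Query-time binding: check every matching file against the live ACLs."""
    return sorted(n for n in repo.acls if term in n and repo.can_view(user, n))


def revoke_and_compare():
    repo = Repository({"plan.doc": {"ann", "bob"}, "plan-budget.xls": {"ann"}})
    early = PermissionIndex(repo)     # refreshed only by rebuild()
    realtime = PermissionIndex(repo)  # notified of each change

    # Bob loses access to plan.doc after both indexes were built.
    repo.acls["plan.doc"] = {"ann"}
    realtime.apply_change("plan.doc", repo.acls["plan.doc"])

    return {
        "early": early.search("bob", "plan"),
        "realtime": realtime.search("bob", "plan"),
        "query_time": search_query_time(repo, "bob", "plan"),
        "live_checks": repo.live_checks,
    }
```

In this model the early-bound index keeps showing Bob the revoked file until its next `rebuild()`, the query-time search is correct but pays one live check per matching file, and the incrementally updated index is correct without any live checks at query time.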

Security Concerns

GFS software tools need to approach security and access control
differently from EFS tools in order to return only the files the
specific searcher is authorized to view. After files are scanned and
indexed, the GFS tool holds a map of the organization's access control
structures. If the software uses early binding, that map may be out of
date by as much as a week, which means users can find and access files
they are not allowed to view.

On the other hand, query-time binding, while inefficient and cumbersome,
keeps results consistent with current ACLs because it performs lengthy,
system-intensive join operations to apply file permissions at the time
of the query.

GFS solutions that use real-time binding keep indexes updated with the
latest ACLs to accommodate changes in file permissions as they happen,
such as when an executive leaves the company. According to Cloudtenna,
real-time binding relies on machine learning to reach the speeds needed
to run continuously and keep the permissions map always up to date.
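
One way to picture the consistency check behind this is as a diff between the ACLs held in the index and the ACLs currently in the source repository. The sketch below is a simplified assumption, not a description of how any product works: `acl_deltas` and `apply_deltas` are hypothetical helper names, ACLs are plain sets of user names, and no machine learning is involved.

```python
def acl_deltas(indexed, current):
    """Return {file: (users_to_add, users_to_remove)} for files whose ACL differs."""
    deltas = {}
    for name in indexed.keys() | current.keys():
        old = indexed.get(name, set())
        new = current.get(name, set())
        if old != new:
            deltas[name] = (new - old, old - new)
    return deltas


def apply_deltas(indexed, deltas):
    """Patch the indexed ACLs in place instead of rebuilding the whole index."""
    for name, (added, removed) in deltas.items():
        indexed[name] = (indexed.get(name, set()) | added) - removed


# Hypothetical example: the CFO leaves the company and is removed everywhere.
indexed = {"board.pdf": {"ceo", "cfo"}, "roadmap.doc": {"cfo", "pm"}}
current = {"board.pdf": {"ceo"}, "roadmap.doc": {"pm"}}

deltas = acl_deltas(indexed, current)
apply_deltas(indexed, deltas)
```

Only the files whose permissions changed are touched, which is what lets the permissions map stay current without a full re-index.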

Scalability

Several GFS options break down at scale because of how they are built.
They attempt to mask that architectural limitation by capping the number
of files they can accommodate per software instance. This can be
acceptable to midsized organizations or departments with fewer than
200,000 files, or those using GFS as a point solution for a single
repository such as a custom-built search function on a website.
Enterprise organizations with considerably more files will find the
costs untenable, because the additional licenses, management, compute
hardware, supporting infrastructure, and/or virtual compute instances
add up rapidly. Licensing for these GFS products may be per-seat for
enterprise customers, but there are also upfront and ongoing integration
costs, especially for OEM partners.
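
A rough calculation shows why a per-instance file cap becomes expensive. The sketch below is illustrative arithmetic only: the 200,000-file cap borrows the figure mentioned above as an assumed per-instance limit, and the file count and per-instance cost are hypothetical placeholders, not vendor pricing.

```python
def instances_needed(total_files, files_per_instance):
    """Number of capped instances required to cover every file."""
    if files_per_instance <= 0:
        raise ValueError("files_per_instance must be positive")
    # Integer ceiling division avoids floating-point rounding on large counts.
    return -(-total_files // files_per_instance)


def total_instance_cost(total_files, files_per_instance, cost_per_instance):
    """Cost grows in steps with the number of instances, not with usage."""
    return instances_needed(total_files, files_per_instance) * cost_per_instance


# Hypothetical enterprise: 5 million files, assumed cap of 200,000 per instance.
needed = instances_needed(5_000_000, 200_000)
cost = total_instance_cost(5_000_000, 200_000, cost_per_instance=10_000)
```

Under these assumptions the organization needs 25 instances, and a single file beyond a multiple of the cap adds a whole extra instance along with its licensing and infrastructure.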

Aside from user and file limitations, many GFS systems are subject to
repository limits. Most accommodate local machines and on-premises
network shares on filers and NAS; fewer work across file sync-and-share
services and clouds (Google Drive, Box, Dropbox, and Microsoft
OneDrive). GFS should also search files in email applications (Outlook
and Gmail) and SaaS applications (such as Salesforce, Slack, Jira, and
Confluence).

"GFS solutions must be built and integrated properly based on each
organization's or OEM's individual requirements," said Aaron Ganek,
Cloudtenna CEO. "In a modern enterprise with thousands of employees and
millions of files across dozens of repositories, data management and
security are complex challenges that GFS solutions can alleviate or
aggravate depending on their architectures."

Cloudtenna's DirectSearch™ works universally across on-premises
repositories, cloud file storage services, and hosted/online
applications. The search-once-and-done tool can find files by name,
sender, date, file type, keyword, content, and other attributes
regardless of where they are stored. DirectSearch uses machine learning
intelligence, natural language processing, and automation to deliver
relevant results and rankings fast – in 400-600 milliseconds.

About Cloudtenna

Cloudtenna was founded to bring order to file chaos with a suite of
AI-powered applications for file management. Cloudtenna's team has
decades of experience in both enterprise infrastructure and cloud file
management services at leading companies including Rhapsody Networks,
Oxygen Cloud, Symantec, Sun Microsystems, NetApp, EMC, Fusion.io, and
VERITAS. The team has developed over 20 successful OEM programs from the
ground up. Its executives are complemented by engineers who have made
key contributions to the NetApp WAFL and VxFS code bases, among other
file systems. Together, the Cloudtenna team is revolutionizing how
people work with files inside the enterprise with the next generation of
file management, file analytics, auditing, and governance. For more
information visit www.cloudtenna.com.
