Patent 3069813 Summary

Third-party information liability

Some of the information on this Web page has been provided by external sources. The Government of Canada is not responsible for the accuracy, reliability or currency of the information supplied by external sources. Users wishing to rely upon this information should consult directly with the source of the information. Content provided by external sources is not subject to official languages, privacy and accessibility requirements.

Claims and Abstract availability

Any discrepancies in the text and image of the Claims and Abstract are due to differing posting times. Text of the Claims and Abstract is posted:

  • At the time the application is open to public inspection;
  • At the time of issue of the patent (grant).
(12) Patent: (11) CA 3069813
(54) English Title: CAPTURING, CONNECTING AND USING BUILDING INTERIOR DATA FROM MOBILE DEVICES
(54) French Title: CAPTURE, CONNEXION ET UTILISATION DE DONNEES D'INTERIEUR DE BATIMENT A PARTIR DE DISPOSITIFS MOBILES
Status: Granted
Bibliographic Data
(51) International Patent Classification (IPC):
  • H04N 7/18 (2006.01)
  • H04W 64/00 (2009.01)
  • H04W 4/021 (2018.01)
  • G06F 30/13 (2020.01)
  • G06Q 50/16 (2012.01)
(72) Inventors:
  • SHAN, QI (United States of America)
  • COLBURN, ALEX (United States of America)
  • GUAN, LI (United States of America)
  • BOYADZHIEV, IVAYLO (United States of America)
(73) Owners:
  • MFTB HOLDCO, INC. (United States of America)
(71) Applicants:
  • ZILLOW GROUP, INC. (United States of America)
(74) Agent: GOWLING WLG (CANADA) LLP
(74) Associate agent:
(45) Issued: 2021-07-06
(86) PCT Filing Date: 2018-07-13
(87) Open to Public Inspection: 2019-01-17
Examination requested: 2020-01-13
Availability of licence: N/A
(25) Language of filing: English

Patent Cooperation Treaty (PCT): Yes
(86) PCT Filing Number: PCT/US2018/042130
(87) International Publication Number: WO2019/014620
(85) National Entry: 2020-01-13

(30) Application Priority Data:
Application No. Country/Territory Date
15/649,427 United States of America 2017-07-13
15/649,434 United States of America 2017-07-13

Abstracts

English Abstract


Techniques are described for automated operations involving acquiring and analyzing information from an interior of a house, building or other structure, for use in generating and providing a representation of that interior. Such techniques may include using a user's mobile device to capture video data from multiple viewing locations (e.g., 360° video at each viewing location) within multiple rooms, and capturing data linking the multiple viewing locations (e.g., by recording video, acceleration and/or other data from the mobile device as the user moves between the two viewing locations), creating a panorama image for each viewing location, analyzing the linking information to model the user's travel path and determine relative positions/directions between at least some viewing locations, creating inter-panorama links in the panoramas to each of one or more other panoramas based on such determined positions/directions, and providing information to display multiple linked panorama images to represent the interior.
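
For illustration of the data flow the abstract describes (capture a panorama per viewing location, capture linking data while moving, estimate directions, attach inter-panorama links, display), a minimal Python sketch follows. It is not part of the patent disclosure; all names (ViewingLocation, LinkingSegment, PanoramaLink, build_tour) are hypothetical and the direction estimation is stubbed out.

```python
# Illustrative sketch only: hypothetical data structures for the capture/link/display
# pipeline described in the abstract. Not the patent's actual implementation.
from dataclasses import dataclass, field
from typing import List

@dataclass
class PanoramaLink:
    """An in-panorama hotspot pointing toward another viewing location."""
    target: str
    direction_deg: float            # bearing of the target relative to this panorama

@dataclass
class ViewingLocation:
    """One capture point inside the building, with its stitched 360-degree panorama."""
    name: str
    panorama_path: str              # path to the stitched panorama image
    links: List[PanoramaLink] = field(default_factory=list)

@dataclass
class LinkingSegment:
    """Data recorded while the user walks between two successive viewing locations."""
    start: str                      # name of the starting viewing location
    end: str                        # name of the ending viewing location
    accelerations: list             # IMU samples captured along the travel path
    frames: list                    # video frames captured along the travel path

def build_tour(locations: List[ViewingLocation],
               segments: List[LinkingSegment]) -> List[ViewingLocation]:
    """Attach an inter-panorama link for each linking segment (direction estimation
    from the acceleration/video data is omitted here and stubbed with 0.0 degrees)."""
    by_name = {loc.name: loc for loc in locations}
    for seg in segments:
        direction = 0.0  # placeholder: would be estimated from seg.accelerations/frames
        by_name[seg.start].links.append(PanoramaLink(target=seg.end, direction_deg=direction))
    return locations
```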



French Abstract

La présente invention concerne des techniques pour des opérations automatisées consistant à acquérir et analyser des informations à partir de l'intérieur d'une maison, d'un bâtiment ou d'une autre structure, afin de les utiliser pour générer et fournir une représentation de cet intérieur. De telles techniques peuvent comprendre l'utilisation d'un dispositif mobile d'utilisateur pour capturer des données vidéo à partir de multiples emplacements de visualisation (par ex., une vidéo à 360° à chaque emplacement de visualisation) à l'intérieur de multiples pièces, et la capture de données reliant les multiples emplacements de visualisation (par ex., en enregistrant une vidéo, une accélération et/ou d'autres données à partir du dispositif mobile lorsque l'utilisateur se déplace entre les deux emplacements de visualisation), la création d'une image panoramique pour chaque emplacement de visualisation, l'analyse des informations de liaison pour modéliser le trajet de déplacement de l'utilisateur et déterminer des positions/directions relatives entre au moins certains emplacements de visualisation, la création de liaisons inter-panoramiques dans les panoramiques avec chacun des autres panoramiques en se basant sur de telles positions/directions déterminées, et la fourniture des informations pour afficher de multiples images panoramiques reliées pour représenter l'intérieur.

Claims

Note: Claims are shown in the official language in which they were submitted.


CLAIMS
I/We claim:
1. A computer-implemented method comprising:
obtaining, by at least one device, linking information for a sequence of viewing locations within an interior of a building, including video information and acceleration data generated by a mobile device as a user carries the mobile device between each successive pair of viewing locations in the sequence;
determining, by the at least one device, and for each successive pair of viewing locations in the sequence, relative positional information that includes at least a direction from a starting viewing location of the successive pair to an ending viewing location of the successive pair, including analyzing the acceleration data of the linking information to model a travel path of the user from the starting viewing location to the ending viewing location of the successive pair and using the modeled travel path as part of determining the direction;
generating, by the at least one device, and using the determined direction for each successive pair of viewing locations in the sequence, information to link panorama images for the viewing locations, wherein each of the panorama images is associated with one of the viewing locations and has views from the one viewing location in each of multiple directions, and wherein the generating includes, for each of the viewing locations other than a last viewing location in the sequence, generating an inter-panorama link for the panorama image associated with the viewing location that points toward a next viewing location in the sequence; and
providing, by the at least one device and for display on a client device, information about the interior of the building that includes the panorama images and that includes the generated inter-panorama links for the panorama images.
2. The computer-implemented method of claim 1 wherein the determining of the relative positional information for one of the successive pairs of viewing locations in the sequence further includes determining a departure direction from the starting viewing location of the one successive pair by analyzing a first portion of the video information that corresponds to that starting viewing location, and determining an arrival direction at the ending viewing location of the one successive pair by analyzing a second portion of the video information that corresponds to that ending viewing location, and wherein modeling of the travel path for the one successive pair is further based in part on the determined departure direction and on the determined arrival direction.
3. The computer-implemented method of claim 2 wherein the determining of the relative positional information for the one successive pair further includes determining a second direction from the ending viewing location of the one successive pair to the starting viewing location of the one successive pair, and wherein the generating of the relative positional information further includes using the determined second direction to generate an inter-panorama link that points toward the panorama image associated with the starting viewing location of the one successive pair from the panorama image for the ending viewing location of the one successive pair.
4. The computer-implemented method of claim 2 wherein the analyzing of the acceleration data of the linking information for the one successive pair further includes performing a double integration operation on each of multiple data points of the acceleration data to determine velocity and location information for the data point, and combining the determined velocity and location information for the multiple data points to construct at least some of the travel path of the user for the one successive pair.
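
Claim 4 recites double integration of the acceleration data to construct the travel path. A minimal numpy sketch of that idea follows, assuming gravity-compensated acceleration samples already expressed in a world frame at a fixed sample rate; the function name, the crude start-at-rest bias estimate (cf. claim 5), and the zero initial conditions are assumptions for illustration, not the patented implementation.

```python
import numpy as np

def integrate_travel_path(accel: np.ndarray, dt: float) -> np.ndarray:
    """Double-integrate acceleration samples (N x 3, in m/s^2, gravity removed and
    expressed in a world frame) into positions along the travel path.

    Illustrative only: a real pipeline would also model sensor bias (claim 5) and
    fuse the result with video-derived departure/arrival directions (claim 2).
    """
    # Crude bias estimate: assume the device starts at rest, so the first few
    # samples should average to zero acceleration.
    bias = accel[:10].mean(axis=0)
    accel = accel - bias

    # First integration: acceleration -> velocity (assume zero initial velocity).
    velocity = np.cumsum(accel * dt, axis=0)

    # Second integration: velocity -> position (assume the path starts at the origin).
    position = np.cumsum(velocity * dt, axis=0)
    return position

# Example: 2 seconds of samples at 100 Hz; the final position approximates the
# displacement from the starting viewing location to the ending viewing location,
# and its horizontal components give a direction between the two locations.
samples = np.zeros((200, 3))
samples[:50, 0] = 0.5          # briefly accelerate along x, then coast
path = integrate_travel_path(samples, dt=0.01)
direction_deg = np.degrees(np.arctan2(path[-1, 1], path[-1, 0]))
```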
5. The computer-implemented method of claim 4 wherein the analyzing of the acceleration data of the linking information for the one successive pair further includes modeling bias from one or more sensors of the mobile device that provide the acceleration data, and using the modeled bias as part of performing the double integration operation.
6. The computer-implemented method of claim 4 wherein the determining of the relative positional information for the one successive pair further includes generating a confidence value for the direction from the starting viewing location of the one successive pair to the ending viewing location of the one successive pair, and determining to use the direction from the starting viewing location of the one successive pair to the ending viewing location of the one successive pair based at least in part on the generated confidence value exceeding a defined threshold.
7. The computer-implemented method of claim 6 wherein the generating of the confidence value is based at least in part on a length of the travel path for the one successive pair, on a straightness of the travel path for the one successive pair, on matching of the first portion of the video information to the panorama image for the starting viewing location of the one successive pair, and on matching of the second portion of the video information to the panorama image for the ending viewing location of the one successive pair.
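
Claims 6 and 7 gate the acceleration-derived direction with a confidence value built from the travel path's length and straightness and from how well the linking video matches the two panoramas. The following hypothetical scoring function shows one way such factors could be combined; the weights and functional form are invented for illustration and are not taken from the patent.

```python
import numpy as np

def path_confidence(path: np.ndarray,
                    departure_match: float,
                    arrival_match: float) -> float:
    """Combine path and image-matching cues into a single confidence in [0, 1].

    path: N x 3 positions from the integrated travel path.
    departure_match / arrival_match: scores in [0, 1] for how well the linking
    video matches the start/end panoramas. Weights below are illustrative only.
    """
    segment_lengths = np.linalg.norm(np.diff(path, axis=0), axis=1)
    traveled = segment_lengths.sum()
    net = np.linalg.norm(path[-1] - path[0])

    # Shorter paths accumulate less integration drift; straighter paths (net
    # displacement close to traveled distance) are easier to model.
    length_score = 1.0 / (1.0 + traveled)
    straightness = net / traveled if traveled > 0 else 0.0

    score = (0.3 * length_score + 0.3 * straightness
             + 0.2 * departure_match + 0.2 * arrival_match)
    return float(np.clip(score, 0.0, 1.0))

# A direction estimate would only be used if path_confidence(...) exceeds a threshold.
```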
8. The computer-implemented method of claim 1 wherein the determining of the relative positional information for one of the successive pairs of viewing locations in the sequence further includes identifying first images taken from the panorama image for the starting viewing location of the one successive pair that have features matching second images taken from the panorama image for the ending viewing location of the one successive pair, and using information from the first images and the second images as part of determining the direction from the starting viewing location of the one successive pair to the ending viewing location of the one successive pair.
9. The computer-implemented method of claim 8 wherein the using of the information from the first images and the second images as part of determining the direction further includes generating an essential matrix for each of multiple pairs of images from the first and second images, determining a value for the direction for each of the multiple pairs based on the essential matrix for the pair, and generating an aggregate value for the direction from the values determined for the multiple pairs.
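
Claim 9 obtains the direction from essential matrices computed over matched image pairs and aggregates the per-pair values. A sketch using OpenCV feature matching and pose recovery follows; the camera intrinsics K, the ORB/brute-force matching choices, and the simple averaging used for aggregation are assumptions for illustration rather than the patent's method.

```python
import cv2
import numpy as np

def direction_from_image_pair(img1, img2, K):
    """Estimate the translation direction from img1's location toward img2's
    location via an essential matrix. Returns a unit 3-vector, or None on failure."""
    orb = cv2.ORB_create(2000)
    kp1, des1 = orb.detectAndCompute(img1, None)
    kp2, des2 = orb.detectAndCompute(img2, None)
    if des1 is None or des2 is None:
        return None
    matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(des1, des2)
    if len(matches) < 8:
        return None
    pts1 = np.float32([kp1[m.queryIdx].pt for m in matches])
    pts2 = np.float32([kp2[m.trainIdx].pt for m in matches])

    E, mask = cv2.findEssentialMat(pts1, pts2, K, cv2.RANSAC, 0.999, 1.0)
    if E is None:
        return None
    _, R, t, _ = cv2.recoverPose(E, pts1, pts2, K)
    return (t / np.linalg.norm(t)).ravel()   # direction only, scale is unknown

def aggregate_direction(image_pairs, K):
    """Average per-pair direction estimates into one aggregate direction
    (simple mean; the patent may aggregate differently, e.g. by consensus)."""
    dirs = [d for a, b in image_pairs
            if (d := direction_from_image_pair(a, b, K)) is not None]
    if not dirs:
        return None
    mean = np.mean(dirs, axis=0)
    return mean / np.linalg.norm(mean)
```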
10. The computer-implemented method of claim 8 wherein the using of the information from the first images and the second images as part of determining the direction further includes generating a homography matrix for each of multiple pairs of images from the first and second images, determining a value for the direction for each of the multiple pairs based on the homography matrix for the pair, and generating an aggregate value for the direction from the values determined for the multiple pairs.
11. The computer-implemented method of claim 8 wherein the using of the information from the first images and the second images as part of determining the direction further includes determining, for each of multiple pairs of images from the first and second images, whether to use an essential matrix or a homography matrix to determine a value for the direction for the pair.
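
Claims 10 and 11 add the homography alternative and a per-pair choice between the two models. One common heuristic, shown below and not necessarily the one the patent uses, fits both models and keeps whichever explains more of the matched points, since a homography tends to fit better for near-planar scenes or rotation-dominant motion.

```python
import cv2
import numpy as np

def choose_motion_model(pts1: np.ndarray, pts2: np.ndarray, K: np.ndarray):
    """Fit both an essential matrix and a homography to matched points and keep the
    model that explains more of the matches. Illustrative heuristic only."""
    E, mask_e = cv2.findEssentialMat(pts1, pts2, K, cv2.RANSAC)
    H, mask_h = cv2.findHomography(pts1, pts2, cv2.RANSAC, 3.0)
    if E is None and H is None:
        return None, None, None

    inliers_e = int(mask_e.sum()) if mask_e is not None else 0
    inliers_h = int(mask_h.sum()) if mask_h is not None else 0

    if inliers_h > inliers_e:
        # decomposeHomographyMat returns up to four (R, t, normal) solutions;
        # additional checks (e.g. cheirality) would be needed to pick one.
        _, Rs, ts, normals = cv2.decomposeHomographyMat(H, K)
        return "homography", H, ts
    else:
        _, R, t, _ = cv2.recoverPose(E, pts1, pts2, K)
        return "essential", E, [t]
```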
12. The computer-implemented method of claim 8 wherein the using of the information from the first images and the second images as part of determining the direction further includes determining, for each of multiple pairs of images from the first and second images, and by using reprojection, that a degree of match between images of the pair exceeds a defined threshold before further using the pair to determine a value for the direction.
13. The computer-implemented method of claim 8 wherein the using of the information from the first images and the second images as part of determining the direction further includes generating a confidence value based on the information from the first images and the second images, and determining to use the direction from the starting viewing location of the one successive pair to the ending viewing location of the one successive pair based at least in part on the generated confidence value exceeding a defined threshold.
14. The computer-implemented method of claim 13 wherein the generating of the confidence value is based at least in part on a number of peaks in an aggregate consensus distribution generated using multiple pairs of images from the first and second images, and on circular coverage of samples in at least one of the panorama image for the starting viewing location of the one successive pair or the panorama image for the ending viewing location of the one successive pair.
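
Claims 13 and 14 base the image-derived confidence on the number of peaks in an aggregate consensus distribution of direction estimates and on circular coverage of the samples. A rough numpy sketch of both measures follows; the bin size, the peak test, and the interpretation of circular coverage as the fraction of occupied angular bins are illustrative assumptions rather than the patent's definitions.

```python
import numpy as np

def direction_consensus(direction_samples_deg: np.ndarray, bin_deg: float = 10.0):
    """Histogram per-pair direction estimates around the circle, count local peaks,
    and measure circular coverage. Returns (num_peaks, coverage_fraction)."""
    bins = int(360 / bin_deg)
    hist, _ = np.histogram(direction_samples_deg % 360.0, bins=bins, range=(0.0, 360.0))

    # A bin counts as a peak if it exceeds both circular neighbours and a small floor.
    left, right = np.roll(hist, 1), np.roll(hist, -1)
    num_peaks = int(np.sum((hist > left) & (hist > right) & (hist >= 2)))

    # Circular coverage: fraction of angular bins that received at least one sample.
    coverage = float(np.count_nonzero(hist)) / bins
    return num_peaks, coverage

# A single dominant peak with broad coverage would suggest high confidence; several
# competing peaks or narrow coverage would lower it.
samples = np.concatenate([np.random.normal(120, 5, 40), np.random.normal(300, 5, 6)])
num_peaks, coverage = direction_consensus(samples)
```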
15. The computer-implemented method of claim 1 wherein the generating of the information to link the panorama images for the viewing locations further includes modifying one or more of the directions from the starting viewing locations of each successive pair to the ending viewing locations of each successive pair by performing a global optimization on the directions, and wherein the generating of at least one inter-panorama link is based on at least one of the modified directions.
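
Claim 15 (and claim 71 later) refines the pairwise directions with a global optimization. The sketch below shows one plausible formulation, not necessarily the patent's: solve for 2-D positions of the viewing locations that best agree with the measured bearings (and, to keep the problem well posed, assumed travel distances from the IMU path), then read refined directions back from the optimized positions. All names and the residual design are illustrative.

```python
import numpy as np
from scipy.optimize import least_squares

def globally_optimize_directions(num_locations, measurements):
    """measurements: list of (i, j, bearing_rad, distance) giving the measured direction
    and travel distance from viewing location i to viewing location j. Returns optimized
    2-D positions and refined bearings; location 0 is pinned at the origin."""
    def residuals(flat):
        pts = np.vstack([[0.0, 0.0], flat.reshape(-1, 2)])
        res = []
        for i, j, bearing, dist in measurements:
            d = pts[j] - pts[i]
            ang = np.arctan2(d[1], d[0])
            res.append(np.arctan2(np.sin(ang - bearing), np.cos(ang - bearing)))  # wrapped angle error
            res.append(np.linalg.norm(d) - dist)                                   # distance error
        return res

    # Arbitrary distinct starting coordinates for the free locations.
    x0 = np.arange(1, (num_locations - 1) * 2 + 1, dtype=float)
    sol = least_squares(residuals, x0)

    pts = np.vstack([[0.0, 0.0], sol.x.reshape(-1, 2)])
    refined = {(i, j): float(np.arctan2(pts[j][1] - pts[i][1], pts[j][0] - pts[i][0]))
               for i, j, _, _ in measurements}
    return pts, refined
```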
16. The computer-implemented method of claim 1 further comprising:
recording, by the mobile device, videos of the interior from each of the viewing locations as the user turns at the viewing locations; and
creating, by the at least one device and from the recorded videos, the panorama images for the viewing locations.
17. The computer-implemented method of claim 1 wherein the providing of the information includes receiving an indication from the client device of a user selection of one of the included generated inter-panorama links during a display of one of the panorama images, and causing, based at least in part on the user selection of the one included generated inter-panorama link, the client device to display another of the panorama images that is associated with the one included generated inter-panorama link.
18. The computer-implemented method of claim 1 wherein the at least one device includes the mobile device, such that the determining of the relative positional information, the generating of the information to link the panorama images, and the providing of the information are performed at least in part by the mobile device.
19. The computer-implemented method of claim 1 wherein the at least one device includes one or more server computing devices that are located remotely from the building and that receive via one or more intervening computer networks data recorded by the mobile device including the linking information and additional data representing the sequence of viewing locations, such that one or more of the determining of the relative positional information, the generating of the information to link the panorama images, and the providing of the information are performed at least in part by the one or more server computing devices.
20. A non-transitory computer-readable medium having stored contents that cause one or more devices to perform automated operations, the automated operations including:
obtaining linking information for a sequence of viewing locations within an interior of a structure, including acceleration data generated by a device as the device moves between each successive pair of viewing locations in the sequence;
determining, by the one or more devices, and for one or more pairs of successive viewing locations in the sequence, relative positional information that includes at least a direction from a starting viewing location of the pair to an ending viewing location of the pair, including analyzing the acceleration data of the linking information to model a travel path of the device from the starting viewing location to the ending viewing location and using the modeled travel path as part of determining the direction;
generating, by the one or more devices, and using the determined direction for each of the one or more pairs of successive viewing locations, information to link visual information associated with the starting viewing location for the pair to visual information associated with the ending viewing location for the pair, including generating a link for the visual information associated with the starting viewing location for the pair that points toward the ending viewing location for the pair; and
providing, by the one or more devices, information for display about the interior of the structure that includes the visual information associated with the starting viewing location of each pair, the visual information associated with the ending viewing location of each pair, and the generated link for each pair.
21. The non-transitory computer-readable medium of claim 20 wherein the device is a mobile computing device associated with a user that participates in recording data for the sequence of viewing locations, wherein the visual information associated with the starting viewing location of each pair is a panorama image from that starting viewing location, wherein the visual information associated with the ending viewing location of each pair is a panorama image from that ending viewing location, and wherein the stored contents include software instructions that, when executed by the one or more devices, program the one or more devices to perform the automated operations.
22. A system, comprising:
one or more hardware processors of one or more computing devices; and
one or more memories with stored instructions that, when executed by at least one of the one or more hardware processors, cause the one or more computing devices to provide data representing a building, including:
obtaining data for a sequence of recording locations associated with the building that include at least one recording location within an interior of the building, including data that is recorded at each of the recording locations, and linking information that includes acceleration data associated with movement between successive recording locations in the sequence;
determining, based at least in part on analyzing the acceleration data, and for one or more pairs of successive recording locations in the sequence, a direction from a starting recording location of the pair to an ending recording location of the pair;
generating, using the determined direction for each of the one or more pairs of successive recording locations, information to link the data recorded at the starting recording location for the pair to the data recorded for the ending recording location for the pair, including generating a link that points from the starting recording location for the pair toward the ending recording location for the pair; and
providing information for presentation about the interior of the building that includes the data recorded at one or more of the recording locations and that includes at least one of the generated links.
23. The system of claim 22 wherein, for two successive recording locations in the sequence of recording locations, the data recorded for each of the successive recording locations includes recorded video, the obtaining of the linking information between the two successive recording locations further includes capturing visual data associated with the movement between the two successive recording locations, the stored instructions include software instructions that further cause the one or more computing devices to create a panorama image for each of the successive recording locations, and the providing of the information includes:
presenting, on a client device of an end user, the panorama image for a first recording location of the two successive recording locations, including presenting a visual representation for a generated link to a second recording location of the two successive recording locations; and
presenting, on the client device and after selection of the visual representation for the generated link, additional visual information that includes the panorama image for the second recording location.
24. The system of claim 22 wherein the obtaining of the data for the sequence of recording locations further includes recording data for one or more recording locations external to the building to include data for at least some of an exterior of the building.
25. The system of claim 22 wherein the obtaining of the data for the sequence of recording locations includes, for each of the at least some recording locations, capturing visual data at the recording location that includes multiple images captured in multiple directions from the recording location.
26. The system of claim 22 wherein the obtaining of the data for the sequence of recording locations includes, for each of the at least some recording locations, capturing audio data at the recording location.
27. A computer-implemented method comprising:
generating, by a mobile device carried by a user, visual data representing multiple locations within an interior of a house, including recording first visual information from a first viewing location within the house for a first panorama image having a 360-degree view around a vertical axis at the first viewing location, capturing, as the user carries the mobile device along a travel path from the first viewing location to a second viewing location within the house, linking information between the first and second viewing locations that includes acceleration data associated with movement of the mobile device along the travel path and that includes multiple images along the travel path, and recording second visual information from the second viewing location for a second panorama image having a 360-degree view around a vertical axis at the second viewing location;
automatically determining, based at least in part on analyzing the acceleration data and the multiple images included in the captured linking information to model the travel path of the user, relative positional information that includes direction information between the first and second viewing locations and that includes a distance between the first and second viewing locations;
automatically generating, using at least the determined direction information, information to link the first and second panorama images, including generating a first inter-panorama link for the first panorama image to be displayed in a direction of the second viewing location and a second inter-panorama link for the second panorama image to be displayed in a direction of the first viewing location; and
presenting, using the first and second panorama images and the generated first and second inter-panorama links, a display of the interior of the house on a client device that includes the first and second panorama images, including displaying the generated first inter-panorama link in the first panorama image in a direction of the second panorama image that is selectable by the user to display the second panorama image, and displaying the generated second inter-panorama link in the second panorama image in a direction of the first panorama image that is selectable by the user to display the first panorama image.
28. The computer-implemented method of claim 27 wherein the capturing of the linking information between the first and second viewing locations includes recording a third video as the user carries the mobile device from the first viewing location to the second viewing location, and wherein the automatic determining of the relative positional information is based in part on the recorded third video.
29. The computer-implemented method of claim 28 wherein the generating of the visual data representing the multiple locations within the interior of the house further includes, for each of multiple additional viewing locations in the house that are part of a sequence from the first viewing location to a final viewing location, and beginning with the second viewing location as a current viewing location in the sequence:
capturing, as the user carries the mobile device from the current viewing location in the sequence to a next viewing location in the sequence, additional linking information between the current and next viewing locations that includes acceleration data associated with movement of the mobile device;
recording an additional video of the interior of the house from the next viewing location that is captured as the user performs at least a full rotation around a vertical axis at the next viewing location, such that the next viewing location becomes a new current viewing location in the sequence; and
creating additional panorama images for the multiple additional viewing locations,
and wherein the automatic determining of the relative positional information includes determining relative positional information between each pair of viewing locations in the sequence, and
wherein the presenting includes presenting the additional panorama images including additional displayed user-selectable links between at least some of the additional panorama images.
30. The computer-implemented method of claim 29 wherein the recording of the additional video for each of the multiple additional viewing locations includes recording audio data, wherein one or more of the additional viewing locations are outside of the house, and wherein the additional video recorded at each of the one or more additional viewing locations includes a portion of an exterior of the house.
31. A computer-implemented method comprising:
recording, by a mobile device carried by a user within an interior of a building, data representing a sequence of viewing locations within the interior, including recording videos of the interior from the viewing locations as the user turns at the viewing locations, and including capturing, using one or more inertial measurement units (IMUs) of the mobile device, linking information that includes acceleration data associated with movement of the mobile device as the user moves between each successive pair of viewing locations in the sequence;
creating, by at least one device and from the recorded videos for the viewing locations, panorama images that include a panorama image for each viewing location with views from the viewing location for each of multiple directions;
generating, by the at least one device and based at least in part on analyzing the acceleration data of the captured linking information, relative positional information between at least some of the viewing locations; and
providing, by the at least one device and for display on a client device, information about the interior of the building that includes the created panorama images and that includes user-selectable links between at least some of the created panorama images based on the generated relative positional information, to cause one or more of the user-selectable links to be displayed within each of the created panorama images to allow an end-user of the client device to change to another of the created panorama images upon selection of a corresponding user-selectable link.
32. The computer-implemented method of claim 31 wherein the capturing of the linking information between each successive pair of viewing locations in the sequence includes recording additional video as the user moves between the viewing locations of the successive pair, and wherein the generating of the relative positional information is based in part on information from the additional video.
33. The computer-implemented method of claim 32 wherein the recording of the additional video for each successive pair and the recording of the videos from the viewing locations are performed by recording a single video, and wherein the method further comprises using sensor data obtained during the recording of the single video to determine separation points between each of the videos from the viewing locations and the additional video for each successive pair.
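
Claim 33 records a single continuous video and uses sensor data to find separation points between per-location capture (rotating in place) and linking segments (walking between locations). A simple illustrative segmentation keyed on smoothed linear-acceleration magnitude follows; the threshold, smoothing window, and minimum segment length are invented parameters, not values from the patent.

```python
import numpy as np

def find_separation_points(accel: np.ndarray, sample_rate: float,
                           walk_threshold: float = 0.8, min_segment_s: float = 2.0):
    """Label each IMU sample as 'walking' (between viewing locations) or 'stationary'
    (rotating in place at a viewing location) and return the sample indices where the
    label changes, i.e. candidate separation points in the single recorded video."""
    # Smooth the magnitude of gravity-removed acceleration over a half-second window.
    mag = np.linalg.norm(accel, axis=1)
    win = max(1, int(0.5 * sample_rate))
    smooth = np.convolve(mag, np.ones(win) / win, mode="same")

    walking = smooth > walk_threshold            # sustained acceleration => walking
    changes = np.flatnonzero(np.diff(walking.astype(int))) + 1

    # Drop changes that would create segments shorter than min_segment_s.
    min_len = int(min_segment_s * sample_rate)
    separation_points, last = [], 0
    for idx in changes:
        if idx - last >= min_len:
            separation_points.append(int(idx))
            last = idx
    return separation_points
```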
34. The computer-implemented method of claim 31 wherein the recording of the video of the interior from one of the viewing locations in the sequence includes automatically determining, by the mobile device, to terminate the recording of the video based at least in part on sensor information recorded by the mobile device.
35. The computer-implemented method of claim 31 wherein the recording of the video of the interior from one of the viewing locations in the sequence includes providing one or more notifications to the user during the recording of the video regarding at least one of movement of the mobile device as the user turns at the one viewing location, one or more environmental factors negatively affecting the recording of the video, or that the one viewing location has been adequately recorded during a full rotation of the mobile device.
36. The computer-implemented method of claim 31 wherein the recording of the video of the interior from one of the viewing locations in the sequence includes automatically modifying, by the mobile device and during the recording, one or more imaging parameters used for the recording based at least in part on analyzing at least some of the recorded video.
37. The computer-implemented method of claim 31 wherein creating the panorama image for one of the viewing locations includes determining to discard at least some of the recorded video of the interior from the one viewing location.
38. The computer-implemented method of claim 31 wherein the recording of the video of the interior from one of the viewing locations in the sequence includes recording the video during multiple rotations of the mobile device using a distinct set of imaging parameters for each of the multiple rotations, and wherein the creating of the panorama image for one viewing location includes integrating at least some video information captured using the multiple distinct sets of imaging parameters during the multiple rotations.
39. The computer-implemented method of claim 31 wherein creating the panorama image for one of the viewing locations includes determining to use a first component image of the panorama image as an initial image, and wherein the providing of the information about the interior of the building includes initiating a display of the panorama image for the one viewing location by displaying the determined initial image.
40. The computer-implemented method of claim 31 wherein the providing of the information includes receiving an indication from the client device of a selection by the end-user of one of the included links during a display of a first of the created panorama images, and causing, based at least in part on the selection of the one included link, the client device to display a distinct second of the created panorama images that is associated with the one included link.
41. The computer-implemented method of claim 31 wherein the at least one device is the mobile device, such that the creating of the panorama images, the generating of the relative positional information, and the providing of the information are performed at least in part by the mobile device.
42. The computer-implemented method of claim 31 wherein the at least one device includes one or more server computing devices that are located remotely from the building and that receive at least some of the recorded data representing the sequence of viewing locations from the mobile device via one or more intervening computer networks, such that one or more of the creating of the panorama images, the generating of the relative positional information, and the providing of the information are performed at least in part by the one or more server computing devices.
43. A non-transitory computer-readable medium having stored contents that cause one or more devices to perform automated operations, the automated operations including:
recording, by at least one device of the one or more devices, data for a sequence of viewing locations within an interior of a structure, including recording visual data of the interior from each of the viewing locations, and including capturing linking information that includes acceleration data associated with movement between successive viewing locations in the sequence;
creating, by at least one device of the one or more devices and from the recorded visual data for the viewing locations, panorama images that include a panorama image for each viewing location with views from the viewing location for each of multiple directions;
generating, by at least one device of the one or more devices and based at least in part on the acceleration data of the captured linking information, relative positional information between at least some of the viewing locations; and
providing, by at least one device of the one or more devices, information for display about the interior of the structure that includes the created panorama images and that includes links for the created panorama images based on the generated relative positional information.
44. The non-transitory computer-readable medium of claim 43 wherein the structure is a building, wherein, for two successive viewing locations in the sequence of viewing locations, the capturing of the linking information includes recording additional visual data as the user moves between the two successive viewing locations, and wherein the generating of the relative positional information between the at least some of the viewing locations includes generating relative positional information between the two successive viewing locations based in part on analyzing the acceleration data of the captured linking information and on the recorded additional visual data.
45. The non-transitory computer-readable medium of claim 44 wherein the providing of the information for display includes:
presenting, on a client device of an end user, the panorama image for a first viewing location of the two successive viewing locations, including presenting a visual link to a second viewing location of the two successive viewing locations that is based on the generated relative positional information; and
presenting, on the client device and after selection of the visual link by the end user, additional visual information that includes at least some of the recorded additional visual data and that includes the panorama image for the second viewing location.
46. The non-transitory computer-readable medium of claim 43 wherein the one or more devices include a mobile device with a camera and with one or more inertial measurement units (IMUs), wherein the recording of the visual data of the interior from one of the viewing locations in the sequence includes recording the visual data during multiple rotations of the mobile device at different vertical viewing angles relative to a vertical axis of rotation at the one viewing location, wherein the creating of the panorama image for one viewing location includes integrating visual data recorded at the different vertical viewing angles, and wherein the capturing of the linking information associated with movement between successive viewing locations includes recording the acceleration data from the one or more IMUs as the user carries the mobile device between those successive viewing locations.
47. The non-transitory computer-readable medium of claim 43 wherein the one or more devices include a mobile computing device associated with a user that participates in the recording of the data for the sequence of viewing locations.
48. The non-transitory computer-readable medium of claim 47 wherein the one or more devices further include a remote server device located externally to the structure that performs at least some of the creating of the panorama images, the generating of the relative positional information, and the providing of the information.
49. The non-transitory computer-readable medium of claim 43 wherein the stored contents include software instructions that, when executed by the one or more devices, program the one or more devices to perform the automated operations.
50. A system, comprising:
one or more hardware processors of one or more computing devices; and
one or more memories with stored instructions that, when executed by at least one of the one or more hardware processors, cause the one or more computing devices to provide data representing an interior of a building, including:
recording data for a sequence of recording locations associated with a building, including recording data representing an interior of the building from each of at least some of the recording locations, and capturing linking information that includes acceleration data associated with movement between successive recording locations in the sequence;
creating a representation of each recording location using the recorded data from the recording location;
generating, based at least in part on the acceleration data of the captured linking information, relative positional information between at least some of the recording locations; and
providing information for presentation about the building that includes the created representations and that includes links for the created representations based on the generated relative positional information.
51. The system of claim 50 wherein, for two successive recording locations in the sequence of recording locations, the recording of data for each of the successive recording locations includes recording video from a camera of a smart phone device that is carried by a user and that is one of the one or more computing devices and that includes one or more inertial measurement units (IMUs), the capturing of the linking information between the two successive recording locations further includes recording the acceleration data from the one or more IMUs as the user carries the smart phone device between the two successive recording locations and includes capturing visual data from the camera of the smart phone device that is associated with the movement between the two successive recording locations, the creating of the representation of each of the successive recording locations includes creating a panorama image for each of the successive recording locations, and the providing of the information includes:
presenting, on a client device of an end user, the panorama image for a first recording location of the two successive recording locations, including presenting a visual link to a second recording location of the two successive recording locations that is based on the generated relative positional information; and
presenting, on the client device and after selection of the visual link, additional visual information that includes the panorama image for the second recording location.
52. The system of claim 50 wherein the recording of the data for the sequence of recording locations further includes recording data for one or more recording locations external to the building to include data for at least some of an exterior of the building.
53. The system of claim 50 wherein the recording of the data for the sequence of recording locations includes, for each of the at least some recording locations, capturing visual data at the recording location that includes multiple images captured in multiple directions from the recording location.
54. The system of claim 50 wherein the recording of the data for the sequence of recording locations includes, for each of the at least some recording locations, capturing audio data at the recording location.
55. A computer-implemented method comprising:
generating, by a mobile device carried by a user, video data representing multiple locations for a house, including:
recording a first video of an interior of the house from a first viewing location within the house that is captured as the user performs at least a full rotation at the first viewing location;
capturing, as the user carries the mobile device from the first viewing location to a second viewing location within the house, and via one or more inertial measurement units (IMUs) of the mobile device, linking information between the first and second viewing locations that includes motion data associated with movement of the mobile device, the motion data including acceleration data regarding the movement of the mobile device; and
recording a second video of the interior of the house from the second viewing location that is captured as the user performs at least a full rotation at the second viewing location;
creating, from the first and second videos, a first panorama image for the first viewing location with a 360-degree view at the first viewing location, and a second panorama image for the second viewing location with a 360-degree view at the second viewing location;
automatically determining, by one or more computing devices and based at least in part on analyzing the motion data included in the captured linking information, relative positional information between the first and second viewing locations; and
presenting, using the generated first and second panorama images, a display of the interior of the house on a client device to a second user that includes the created first and second panorama images, including displaying a first user-selectable link in the first panorama image in a direction of the second panorama image that is generated from the automatically determined relative positional information, and displaying a second user-selectable link in the second panorama image in a direction of the first panorama image that is generated from the automatically determined relative positional information, to allow the second user to change from a display of the first panorama image to a display of the second panorama image upon selection of the first user-selectable link, and to allow the second user to change from the display of the second panorama image to the display of the first panorama image upon selection of the second user-selectable link.
56. The computer-implemented method of claim 55 wherein the capturing of the linking information between the first and second viewing locations includes recording a third video as the user carries the mobile device from the first viewing location to the second viewing location, and wherein at least one of the presenting of the display of the interior of the house or the automatic determining of the relative positional information is based in part on the recorded third video.
57. The computer-implemented method of claim 55 wherein the generating of the video data representing the multiple locations further includes, for each of multiple additional viewing locations that are part of a sequence from the first viewing location to a final viewing location, and beginning with the second viewing location as a current viewing location in the sequence:
capturing, as the user carries the mobile device from the current viewing location in the sequence to a next viewing location in the sequence, additional linking information between the current and next viewing locations that includes acceleration data associated with movement of the mobile device; and
recording an additional video from the next viewing location that is captured as the user performs at least a full rotation around a vertical axis at the next viewing location, such that the next viewing location becomes a new current viewing location in the sequence,
and wherein the creating includes creating additional panorama images for the multiple additional viewing locations,
wherein the automatic determining of the relative positional information includes determining relative positional information between each pair of viewing locations in the sequence, and
wherein the presenting includes presenting the additional panorama images including additional displayed user-selectable links between at least some of the additional panorama images.
58. The computer-implemented method of claim 57 wherein the recording of the additional video for each of the multiple additional viewing locations includes recording audio data, wherein one or more of the additional viewing locations are outside of the house, and wherein the additional video recorded at each of the one or more additional viewing locations includes a portion of an exterior of the house.
59. The computer-implemented method of claim 55 wherein the mobile device is a smart phone that is one of the one or more computing devices, and wherein the creating, the automatic determining and providing of information for the presenting are performed by the mobile device.
60. The computer-implemented method of claim 55 wherein the one of the one or more computing devices include one or more server computing devices that are at a location outside of the house and that receive the generated video data via one or more transmissions from the mobile device, and wherein the creating, the automatic determining and providing of information for the presenting are performed at least in part by the one or more server computing devices.
61. A computer-implemented method comprising:
obtaining, by at least one device, linking information from movement between a sequence of viewing locations within an interior of a building, wherein the linking information includes video information generated by a mobile device as a user carries the mobile device between each successive pair of viewing locations in the sequence;
determining, by the at least one device, and for each successive pair of viewing locations in the sequence, relative positional information that includes at least a direction from a starting viewing location of the successive pair to an ending viewing location of the successive pair, including analyzing at least the video information of the linking information to model a travel path of the user from the starting viewing location of the successive pair to the ending viewing location of the successive pair, and using the modeled travel path as part of determining the direction;
generating, by the at least one device, and using the determined direction for each successive pair of viewing locations in the sequence, information to link panorama images for the viewing locations, wherein each of the panorama images is associated with one of the viewing locations and has views from the one viewing location in each of multiple directions, and wherein the generating includes, for each of the viewing locations other than a last viewing location in the sequence, generating an inter-panorama link for the panorama image associated with the viewing location that points toward a next viewing location in the sequence; and
providing, by the at least one device and for display on a client device, information about the interior of the building that includes the panorama images and that includes the generated inter-panorama links for the panorama images, to enable changing between displays of two of the panorama images upon selection of one of the generated inter-panorama links that is associated with the two panorama images.
62. The computer-implemented method of claim 61 wherein the determining of the relative positional information for one of the successive pairs of viewing locations further includes determining a departure direction from the starting viewing location of the one successive pair by analyzing a first portion of obtained video information that corresponds to that starting viewing location, and determining an arrival direction at the ending viewing location of the one successive pair by analyzing a second portion of obtained video information that corresponds to that ending viewing location, and wherein modeling of the travel path for the one successive pair is further based in part on the determined departure direction and on the determined arrival direction.
63. The computer-implemented method of claim 62 wherein the determining of the relative positional information for the one successive pair further includes determining a second direction from the ending viewing location of the one successive pair to the starting viewing location of the one successive pair, and wherein the generating of the information further includes using the determined second direction to generate an inter-panorama link that points toward the panorama image associated with the starting viewing location of the one successive pair from the panorama image for the ending viewing location of the one successive pair.
64. The computer-implemented method of claim 61 wherein the determining of the relative positional information for one of the successive pairs of viewing locations further includes identifying first image subsets of the panorama image for the starting viewing location of the one successive pair that have features matching second image subsets of the panorama image for the ending viewing location of the one successive pair, and using information from the first image subsets and the second image subsets as part of determining the direction from the starting viewing location of the one successive pair to the ending viewing location of the one successive pair.
65. The computer-implemented method of claim 64 wherein the using of the information from the first image subsets and the second image subsets as part of determining the direction for the one successive pair further includes generating an essential matrix for each of multiple pairs of image subsets each having one of the first image subsets and one of the second image subsets, determining values from the essential matrices for the multiple pairs that correspond to the direction for the one successive pair, and generating an aggregate value for the direction for the one successive pair from the values determined for the multiple pairs.
66. The computer-implemented method of claim 64 wherein the using of the information from the first image subsets and the second image subsets as part of determining the direction for the one successive pair further includes generating a homography matrix for each of multiple pairs of image subsets each having one of the first image subsets and one of the second image subsets, determining values from the homography matrices for the multiple pairs that correspond to the direction for the one successive pair, and generating an aggregate value for the direction for the one successive pair from the values determined for the multiple pairs.
67. The computer-implemented method of claim 64 wherein the using of the information from the first image subsets and the second image subsets as part of determining the direction for the one successive pair further includes determining, for each of multiple pairs of image subsets each having a first image subset and a second image subset, whether to use an essential matrix or a homography matrix to determine a value for the direction for the one successive pair.
68. The computer-implemented method of claim 64 wherein the using of the information from the first image subsets and the second image subsets as part of determining the direction for the one successive pair further includes determining, for each of multiple pairs of image subsets each having one of the first image subsets and one of the second image subsets, and by using reprojection, that a degree of match between the image subsets of the pair exceeds a defined threshold before further using the pair to determine a value for the direction for the one successive pair.
69. The computer-implemented method of claim 64 wherein the using of the information from the first image subsets and the second image subsets as part of determining the direction for the one successive pair further includes generating a confidence value based on the information from the first image subsets and the second image subsets, and using the determined direction from the starting viewing location of the one successive pair to the ending viewing location of the one successive pair based at least in part on the generated confidence value exceeding a defined threshold.
70. The computer-implemented method of claim 69 wherein the generating of the confidence value is based at least in part on a number of peaks in an aggregate consensus distribution generated using multiple pairs of image subsets each having one of the first image subsets and one of the second image subsets, and on circular coverage of samples in at least one of the panorama image for the starting viewing location of the one successive pair or the panorama image for the ending viewing location of the one successive pair.
71. The computer-implemented method of claim 61 wherein the generating of the information to link the panorama images for the viewing locations further includes modifying one or more of the directions from the starting viewing locations of each successive pair to the ending viewing locations of each successive pair by performing a global optimization on the directions, and wherein the generating of the one inter-panorama link is based on at least one of the modified directions.
72. The computer-implemented method of claim 61 wherein the at least one device includes the mobile device and the mobile device performs at least in part the determining and the generating and the providing and/or the at least one device includes one or more server computing devices that are located remotely from the building and are in communication over one or more intervening computer networks with the mobile device and that perform at least in part one or more of the determining and the generating and the providing, and wherein the method further comprises:
recording, by the mobile device, videos of the interior from each of the viewing locations as the user turns at the viewing locations; and
creating, by the at least one device and from the recorded videos, the panorama images for the viewing locations.
73. The computer-implemented method of claim 61 wherein the providing of the information includes receiving an indication from the client device of a user selection of the one generated inter-panorama link during a display of one of the two panorama images, and causing, based at least in part on the user selection of the one generated inter-panorama link, the client device to display the other of the two panorama images.
74. A non-transitory computer-readable medium having stored contents
that cause one or more computing devices to perform automated operations, the
automated operations including at least:
obtaining linking information for a sequence of viewing locations associated
with a structure that include at least one viewing location within an interior
of the
structure, wherein the linking information includes data that is generated as
an
acquisition device moves along a travel path between the viewing locations and
that
includes one or more images captured between each pair of successive viewing
locations in the sequence, wherein the acquisition device is a mobile
computing
device;
determining, by the one or more computing devices, and for one or more of
the pairs of successive viewing locations in the sequence, relative positional

information that includes at least a direction from a starting viewing
location of the
pair to an ending viewing location of the pair, including analyzing the
linking
information to model a portion of the travel path from the starting viewing
location to
the ending viewing location and using the modeled portion of the travel path
as part
of determining the direction;
generating, by the one or more computing devices, and using the determined
direction for each of the one or more pairs of successive viewing locations,
information to link the starting viewing location for the pair to the ending
viewing
location for the pair, including generating a link for visual information
associated with
the starting viewing location for the pair that points toward the ending
viewing
location for the pair, wherein the visual information associated with the
starting
viewing location of the indicated pair is a first panorama image from that
starting
viewing location, and wherein visual information associated with the ending
viewing
location of the indicated pair is a second panorama image from that ending
viewing
location; and
providing, by the one or more computing devices, information for display
about the structure that includes the visual information associated with the
starting
viewing location of an indicated pair of the one or more pairs, and the
generated link
for the indicated pair, wherein the providing of the information includes
providing the
visual information associated with the indicated pair and enables an end user
to
select a displayed representation of the generated link to change between a
display
of the first panorama image to a display of the second panorama image, and
wherein the stored contents include software instructions that, when executed
by
the one or more computing devices, program the one or more computing devices to perform
the automated operations.
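One assumed way to model a portion of the travel path from the linking information, as recited in claim 74, is to integrate horizontal accelerometer samples captured while the acquisition device moves between the two viewing locations; the sketch below is illustrative only and ignores the complementary visual-odometry contribution of the captured linking images:

    # Illustrative sketch only: travel-path direction from double-integrated IMU data,
    # assuming samples already rotated into a fixed horizontal reference frame.
    import numpy as np

    def travel_direction_from_imu(timestamps, accel_xy):
        t = np.asarray(timestamps, dtype=float)       # (N,) seconds
        a = np.asarray(accel_xy, dtype=float)         # (N, 2) m/s^2
        dt = np.diff(t)
        # First integration: acceleration -> velocity (trapezoidal rule).
        v = np.vstack([np.zeros(2),
                       np.cumsum(0.5 * (a[1:] + a[:-1]) * dt[:, None], axis=0)])
        # Second integration: velocity -> position.
        p = np.vstack([np.zeros(2),
                       np.cumsum(0.5 * (v[1:] + v[:-1]) * dt[:, None], axis=0)])
        net = p[-1] - p[0]
        # Bearing, in degrees, from the starting toward the ending viewing location.
        return float(np.degrees(np.arctan2(net[1], net[0])))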
75. A system, comprising:
one or more hardware processors of one or more computing devices; and
one or more memories with stored instructions that, when executed by at
least one of the one or more hardware processors, cause the one or more
computing devices to provide data representing a building, including:
obtaining data for multiple recording locations associated with the
building, including data that is recorded at each of the recording locations,
and
including visual linking information captured during movement between at least

some of the recording locations;
determining, for one or more pairs of recording locations, and based
at least in part on analyzing the visual linking information, a direction from
a starting
recording location of the pair to an ending recording location of the pair,
wherein the
one or more pairs of recording locations include a first pair whose starting
recording
location has recorded data that includes a plurality of first images recorded
at a first
of one or more recording locations and whose ending location has recorded data

that includes a plurality of second images recorded at a second of the one or
more
recording locations, and wherein visual linking information between the
starting and
ending recording locations of the first pair includes one or more additional
images
captured during movement;
creating a first panorama image for the starting recording location for
the first pair from the plurality of first images, and creating a second
panorama
image for the ending recording location for the first pair from the plurality
of second
images;
generating, for each of the one or more pairs of recording locations
and using the determined direction for the pair, information to link the data
recorded
at the starting recording location for the pair to the data recorded for the
ending
recording location for the pair, including generating a link that points
between the
starting recording location for the pair and the ending recording location for
the pair;
and
providing information for presentation about the building that includes
data recorded at the one or more recording locations and that includes at
least one
of the generated links, including:
presenting, on a client device of an end user, the first panorama
image for the starting recording location of the first pair, and a visual
representation
of the generated link between the starting and ending recording locations for
the first
pair; and
presenting, on the client device and after selection of the visual
representation of the generated link, additional visual information that
includes the
second panorama image for the ending recording location of the first pair.
76. The system of claim 75 wherein the one or more recording locations
are external to the building, and wherein the obtaining of the data for the
multiple
recording locations further includes recording data at each of the one or more

recording locations corresponding to at least some of an exterior of the
building.
77. The system of claim 76 wherein the one or more recording locations
further include a sequence of two or more recording locations within an
interior of
the building, and wherein one or more images are captured at each of the two
or
more recording locations.
78. The system of claim 75 wherein the obtaining of the data for the
multiple recording locations includes, for each of the one or more recording
locations, capturing audio data at the recording location.
79. A computer-implemented method comprising:
obtaining, by at least one device, multiple panorama images for a sequence
of multiple viewing locations within an interior of a building, wherein the
multiple
panorama images include, for each of the multiple viewing locations, a
respective
one of the multiple panorama images captured at that viewing location and
associated with that viewing location;
determining, by the at least one device, and for each successive pair of
viewing locations in the sequence, relative positional information that
includes at
least a determined direction from a starting viewing location of the
successive pair
to an ending viewing location of the successive pair, including analyzing
first visual
information included in the panorama image associated with the starting
viewing
location and second visual information included in the panorama image
associated
with the ending viewing location to identify matching features in the first
and second
visual information and to use positions in the first and second visual
information of
the identified matching features as part of the determining of the relative
positional
information;
generating, by the at least one device, and using the determined direction for

each successive pair of viewing locations in the sequence, information to link
the
multiple panorama images for the multiple viewing locations, including
generating,
for each of the viewing locations other than a last viewing location in the
sequence,
an inter-panorama link for the panorama image associated with that viewing
location
that points toward a next viewing location in the sequence; and
providing, by the at least one device and for presentation on a client device,

information about the interior of the building that includes the multiple
panorama
images and that includes the generated inter-panorama links for the multiple
panorama images, to enable changing between displays of two of the multiple
panorama images upon selection of one of the generated inter-panorama links
that
is associated with the two panorama images.
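The matching-feature analysis recited in claim 79 could be realized, for example, with off-the-shelf feature detection and descriptor matching; the following sketch (function name and parameter values are assumptions, not part of the claim) returns matched pixel positions in the two panorama images:

    # Illustrative sketch only: ORB feature matching between two panorama images.
    import cv2
    import numpy as np

    def match_panorama_features(pano_a, pano_b, max_features=4000, ratio=0.75):
        orb = cv2.ORB_create(nfeatures=max_features)
        kp_a, des_a = orb.detectAndCompute(pano_a, None)
        kp_b, des_b = orb.detectAndCompute(pano_b, None)
        matcher = cv2.BFMatcher(cv2.NORM_HAMMING)
        matches = matcher.knnMatch(des_a, des_b, k=2)
        pts_a, pts_b = [], []
        for pair in matches:
            # Lowe-style ratio test to keep only distinctive matches.
            if len(pair) == 2 and pair[0].distance < ratio * pair[1].distance:
                pts_a.append(kp_a[pair[0].queryIdx].pt)
                pts_b.append(kp_b[pair[0].trainIdx].pt)
        return np.float32(pts_a), np.float32(pts_b)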
80. The computer-implemented method of claim 79 wherein the
determining of the relative positional information for one of the successive
pairs of
viewing locations further includes:
identifying first image subsets of the panorama image associated with the
starting viewing location of the one successive pair that include the matching

features, and identifying second image subsets of the panorama image
associated
with the ending viewing location of the one successive pair that include the
matching
features; and
determining the direction for the one successive pair by:
generating an essential matrix for each of multiple first pairs of image
subsets each having one of the first image subsets and one of the second image

subsets, determining values from the essential matrices for the multiple first
pairs
that correspond to the direction for the one successive pair, generating an
aggregate
first value for the direction for the one successive pair from the values
determined
for the multiple first pairs, and using the generated aggregated first value
as part of
the determining of the direction for the one successive pair; and/or
generating a homography matrix for each of multiple second pairs of
image subsets each having one of the first image subsets and one of the second

image subsets, determining values from the homography matrices for the
multiple
second pairs that correspond to the direction for the one successive pair,
generating
an aggregate second value for the direction for the one successive pair from
the
values determined for the multiple second pairs, and using the generated
aggregated second value as part of the determining of the direction for the
one
successive pair.
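As an illustrative sketch of the essential-matrix and homography alternatives recited in claim 80 (assuming pinhole intrinsics K for the rectilinear image subsets; the helper names and parameter choices are assumptions, not the claimed method), a per-pair direction can be recovered from the translation component of the relative pose and the per-pair values combined with a circular mean:

    # Illustrative sketch only: per-pair direction from an essential matrix or a
    # homography, followed by aggregation of the per-pair estimates.
    import cv2
    import numpy as np

    def direction_from_subset_pair(pts_a, pts_b, K, planar=False):
        if planar:
            H, _ = cv2.findHomography(pts_a, pts_b, cv2.RANSAC, 5.0)
            _, rotations, translations, _ = cv2.decomposeHomographyMat(H, K)
            t = translations[0].ravel()      # one of the candidate decompositions
        else:
            E, _ = cv2.findEssentialMat(pts_a, pts_b, K, method=cv2.RANSAC,
                                        prob=0.999, threshold=1.0)
            _, _, t, _ = cv2.recoverPose(E, pts_a, pts_b, K)
            t = t.ravel()
        # Horizontal bearing of the (unit, up-to-sign) translation direction.
        return float(np.degrees(np.arctan2(t[0], t[2])))

    def aggregate_direction(angles_deg):
        # Circular mean of the per-pair direction estimates.
        a = np.radians(angles_deg)
        return float(np.degrees(np.arctan2(np.sin(a).mean(), np.cos(a).mean())))

The and/or structure of the claim is reflected in the planar flag: the essential-matrix branch suits general scenes, while the homography branch suits matched features lying on a dominant plane such as a wall or floor.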
81. The computer-implemented method of claim 79 wherein the
determining of the relative positional information for one of the successive
pairs of
viewing locations further includes determining a relative rotation between the

panorama image associated with the starting viewing location of the one
successive
pair and the panorama image associated with the ending viewing location of the
one
successive pair, and using the determined relative rotation as part of
determining
the direction from the starting viewing location of the one successive pair to
the
ending viewing location of the one successive pair.
82. A non-transitory computer-readable medium having stored contents
that cause one or more computing devices to perform automated operations
including at least:
obtaining multiple images for multiple viewing locations associated with a
structure that include one or more viewing locations within an interior of the

structure, wherein the multiple images include, for each of the multiple
viewing
locations, an image associated with that viewing location to show one or more
views
from that viewing location;
determining, by the one or more computing devices, and for each of one or
more pairs of viewing locations from the multiple viewing locations, relative
positional information that includes at least a determined direction from a
first
viewing location of the pair to a second viewing location of the pair,
including
analyzing first visual information included in the image associated with the
first
viewing location and second visual information included in the image
associated
with the second viewing location to identify matching features in the first
and second
visual information and to use positions in the first and second visual
information of
the identified matching features as part of the determining of the relative
positional
information for the pair;
generating, by the one or more computing devices, and using the determined
direction for each of the one or more pairs of viewing locations, information
to link
the first viewing location for the pair to the second viewing location for the
pair,
including generating a link that points from the first viewing location for
the pair
toward the second viewing location for the pair; and
providing, by the one or more computing devices, information for
presentation about the structure that includes at least some of the first
visual
information from the image associated with the first viewing location of an
indicated
pair of the one or more pairs, and that includes the generated link pointing
from the
first viewing location for the indicated pair toward the second viewing
location for
the indicated pair.
83. The non-transitory computer-readable medium of claim 82 wherein
the image associated with the first viewing location of each of the one or
more pairs
is a first panorama image from that first viewing location, wherein the image
associated with the second viewing location of each of the one or more pairs
is a
second panorama image from that second viewing location, and wherein the
providing of the information includes initiating presentation of at least some
of the
first panorama image for the indicated pair and enabling an end user to select
a
displayed visual representation of the generated link for the indicated pair
to change
to a presentation of at least some of the second panorama image for the
indicated
pair, and wherein the stored contents include software instructions that, when

executed by the one or more computing devices, program the one or more
computing devices to perform the automated operations.
84. The non-transitory computer-readable medium of claim 83 wherein
the structure is a building, wherein the multiple viewing locations include a
sequence
of viewing locations within an interior of the building, and wherein the
determining
and the generating are performed for each successive pair of viewing locations
in
the sequence.
85. The non-transitory computer-readable medium of claim 83 wherein
the determining of the relative positional information for one of the pairs of
viewing
locations further includes identifying first image subsets of the panorama
image
associated with the first viewing location of the one pair that include the
matching
features and identifying second image subsets of the panorama image associated

with the second viewing location of the one pair that include the matching
features,
and using the first image subsets and the second image subsets as part of
determining the direction from the first viewing location of the one pair to
the second
viewing location of the one pair.
86. The non-transitory computer-readable medium of claim 85 wherein
the using of the first image subsets and the second image subsets as part of
determining the direction for the one pair further includes generating an
essential
matrix for each of multiple pairs of image subsets each having one of the
first image
subsets and one of the second image subsets, determining values from the
essential matrices for the multiple pairs that correspond to the direction for
the one
pair, and generating an aggregate value for the direction for the one pair
from the
values determined for the multiple pairs.
87. The non-transitory computer-readable medium of claim 85 wherein
the using of the first image subsets and the second image subsets as part of
determining the direction for the one pair further includes generating a
homography
matrix for each of multiple pairs of image subsets each having one of the
first image
subsets and one of the second image subsets, determining values from the
homography matrices for the multiple pairs that correspond to the direction
for the
one pair, and generating an aggregate value for the direction for the one pair
from
the values determined for the multiple pairs.
88. The non-transitory computer-readable medium of claim 85 wherein
the using of the first image subsets and the second image subsets as part of
determining the direction for the one pair further includes determining, for
each of
multiple pairs of image subsets each having a first image subset and a second
image subset, whether to use an essential matrix or a homography matrix to
determine a value for the direction for the one pair.
89. The non-transitory computer-readable medium of claim 85 wherein
the using of the first image subsets and the second image subsets as part of
determining the direction for the one pair further includes determining, for
each of
multiple pairs of image subsets each having one of the first image subsets and
one
of the second image subsets, and by using reprojection, that a degree of match

between the image subsets of the pair exceeds a defined threshold before
further
using the pair to determine a value for the direction for the one pair.
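The reprojection test of claim 89 might be implemented, under one set of assumed thresholds, by reprojecting matched features through a fitted homography and requiring a minimum inlier fraction before the image-subset pair contributes a direction value:

    # Illustrative sketch only: reprojection-based degree-of-match filter.
    # pts_a, pts_b: float32 (N, 2) matched pixel positions in the two image subsets.
    import cv2
    import numpy as np

    def subset_pair_passes(pts_a, pts_b, max_px_error=3.0, min_inlier_fraction=0.6):
        H, _ = cv2.findHomography(pts_a, pts_b, cv2.RANSAC, max_px_error)
        if H is None:
            return False
        projected = cv2.perspectiveTransform(pts_a.reshape(-1, 1, 2), H).reshape(-1, 2)
        errors = np.linalg.norm(projected - pts_b, axis=1)
        degree_of_match = np.mean(errors < max_px_error)
        return degree_of_match > min_inlier_fraction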
90. The non-transitory computer-readable medium of claim 85 wherein
the using of the first image subsets and the second image subsets as part of
determining the direction for the one pair further includes generating a
confidence
value based on information from the first image subsets and the second image
subsets, and using the determined direction from the first viewing location of
the one
pair to the second viewing location of the one pair based at least in part on
the
generated confidence value exceeding a defined threshold.
91. The non-transitory computer-readable medium of claim 90 wherein
the generating of the confidence value is based at least in part on a number
of peaks
in an aggregate consensus distribution generated using multiple pairs of image
subsets each having one of the first image subsets and one of the second image

subsets, and/or on circular coverage of samples in at least one of the
panorama
image associated with the first viewing location of the one pair or the
panorama
image associated with the second viewing location of the one pair.
92. The non-transitory computer-readable medium of claim 83 wherein
the determining of the relative positional information for one of the pairs of
viewing
locations further includes determining one or more angles within the panorama
image associated with the first viewing location of the one pair that point
toward the
second viewing location of the one pair, wherein the generating of the link
for the
panorama image associated with the first viewing location of the one pair
includes
generating an inter-panorama link from the panorama image associated with the
first viewing location of the one pair to the panorama image associated with
the
second viewing location of the one pair, and includes associating that inter-
panorama link with the determined one or more angles within the panorama image

associated with the first viewing location of the one pair, and wherein the
providing
of the information includes initiating presentation of the panorama image
associated
with the first viewing location of the one pair and displaying a visual
representation
of the generated inter-panorama link that is overlaid on the presented
panorama
image at a position corresponding to the determined one or more angles within
the
presented panorama image.
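For the angle association and overlay recited in claim 92, one assumed mapping (valid only for a panorama stored in equirectangular form with its zero heading at the left edge) converts the determined angles into the pixel position at which the link's visual representation is drawn:

    # Illustrative sketch only: map link angles to an overlay position in an
    # equirectangular panorama image.
    def link_overlay_position(pano_width, pano_height,
                              horizontal_angle_deg, vertical_angle_deg=0.0):
        # Horizontal angle in [0, 360) maps linearly across the image width.
        col = int((horizontal_angle_deg % 360.0) / 360.0 * pano_width)
        # Vertical angle in [-90, +90] maps over the image height, 0 at the horizon.
        row = int((0.5 - vertical_angle_deg / 180.0) * pano_height)
        return col, min(max(row, 0), pano_height - 1)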
93. The non-transitory computer-readable medium of claim 82 wherein
the determining of the relative positional information for one of the pairs of
viewing
locations further includes determining a relative rotation between the image
associated with the first viewing location of the one pair and the image
associated
with the second viewing location of the one pair, and using the determined
relative
rotation as part of determining the direction from the first viewing location
of the one
pair to the second viewing location of the one pair.
94. The non-transitory computer-readable medium of claim 82 wherein
the determining of the relative positional information for one of the pairs of
viewing
locations further includes determining a relative translation between the
first viewing
location of the one pair and the second viewing location of the one pair, and
using
the determined relative translation as part of determining the relative
positional
information for the one pair of viewing locations.
95. The non-transitory computer-readable medium of claim 82 wherein
the matching features identified in first and second visual information of
images
associated with the indicated pair of viewing locations include structural
features in
at least one room of the building, and wherein the determining of the relative

positional information for the indicated pair includes identifying locations
of those
structural features in the images associated with the viewing locations of the

indicated pair.
96. The non-transitory computer-readable medium of claim 95 wherein
the structural features in the at least one room include one or more
structural
features that each are part of at least one of a window, or a doorway, or a
corner,
or a wall.
97. The non-transitory computer-readable medium of claim 96 wherein
the contents of the at least one room include one or more features that each
are
part of at least one of a piece of furniture, or a moveable object.
98. The non-transitory computer-readable medium of claim 82 wherein
the matching features identified in first and second visual information of
images
associated with the indicated pair of viewing locations include contents of at
least
one room of the building, and wherein the determining of the relative
positional
information for the indicated pair includes identifying locations of those
contents in
the images associated with the indicated pair.
99. The non-transitory computer-readable medium of claim 82 wherein
the matching features identified in first and second visual information of
images
associated with the indicated pair of viewing locations include one or more
visible
three-dimensional points in at least one room of the building, and wherein the

determining of the relative positional information for the indicated pair
includes
identifying locations of those visible three-dimensional points in the images
associated with the viewing locations of the indicated pair.
100. The non-transitory computer-readable medium of claim 82 wherein
the matching features identified in first and second visual information of
images
associated with the indicated pair of viewing locations include one or more
planar
surfaces, and wherein the determining of the relative positional information
for the
indicated pair includes identifying locations of those planar surfaces in the
images
associated with the viewing locations of the indicated pair.
101. The non-transitory computer-readable medium of claim 82 wherein
the determining of the relative positional information for the indicated pair
of viewing
locations further includes determining a second direction from the second
viewing
location of the indicated pair to the first viewing location of the indicated
pair, and
wherein the generating of the information further includes using the
determined
second direction to generate an inter-panorama link that points toward the
image
associated with the first viewing location of the indicated pair from the
image
associated with the second viewing location of the indicated pair.
102. The non-transitory computer-readable medium of claim 82 wherein
the one or more pairs of viewing locations include multiple pairs of viewing
locations
that in aggregate include all of the multiple viewing locations, wherein the
generating
of the information further includes generating information to link the
multiple images
for the multiple viewing locations by modifying one or more of the determined
directions from the first viewing locations of each of the multiple pairs to
the second
viewing locations of each of the multiple pairs by performing a global
optimization
on the determined directions, and wherein the generating of the link is based
on at
least one of the modified directions.
103. The non-transitory computer-readable medium of claim 82 wherein
the one or more computing devices include a mobile device that participates in

capturing visual information for the multiple images and that performs at
least in part
the determining and the generating and the providing, and/or the one or more
computing devices include one or more server computing devices that are
located
remotely from the structure and are in communication over one or more
intervening
computer networks with a device at the structure and that perform at least in
part
one or more of the determining and the generating and the providing, and
wherein
the automated operations further include creating, by the one or more
computing
devices and from captured visual information at the structure, the multiple
images
for the multiple viewing locations.
104. The non-transitory computer-readable medium of claim 82 wherein
the providing of the information further includes transmitting the provided
information to a client device for presentation to a user, and receiving an
indication
from the client device of a user selection by the user of the generated link
during
presentation of the first visual information from the image associated with
the first
viewing location for the indicated pair, and causing, based at least in part
on the
user selection of the generated link, the client device to present at least
some of the
second visual information from the image associated with the second viewing
location of the indicated pair.
105. The non-transitory computer-readable medium of claim 82 wherein
the multiple viewing locations include an external viewing location that is
external to
the structure and whose associated image includes visual data of at least some
of
an exterior of the structure, and wherein the external viewing location and
one of
the one or more viewing locations within the interior of the structure are one
of the
one or more pairs of viewing locations, such that the determining of the
relative
positional information for that pair is based at least in part on one or more
matching
features being visible in both the associated image of the external viewing
location
and the associated image of the one viewing location within the interior of
the
structure.
106. The non-transitory computer-readable medium of claim 82 wherein
the multiple viewing locations include a first viewing location in a first
room of the
structure and further include a second viewing location in a second room of
the
structure, wherein the first and second viewing locations are not fully
visible to each
other, and wherein the first and second viewing locations are one of the one
or more
pairs of viewing locations, such that the determining of the relative
positional
information for that pair is based at least in part on one or more matching
features
being visible in the associated images for both of the first and second
viewing
locations.
107. A system, comprising:
one or more hardware processors of one or more computing devices; and
one or more memories with stored instructions that, when executed by at
least one of the one or more hardware processors, cause the one or more
computing devices to provide data representing a building, including:
obtaining multiple data groups for multiple recording locations
associated with the building, wherein the multiple data groups include, for
each of
the multiple recording locations, a respective one of the multiple data groups
that is
recorded at that recording location and associated with that recording
location;
determining, for each of one or more pairs of recording locations from
the multiple recording locations, a direction from a starting recording
location of the
pair to an ending recording location of the pair, including analyzing first
information
included in the data group associated with the starting recording location and

second information included in the data group associated with the ending
recording
location to identify matching features, wherein the one or more pairs of
recording
locations include at least one pair of recording locations in which the
recorded data
group for the starting recording location of the at least one pair includes
one or more
first images and in which the recorded data group for the ending recording
location
of the at least one pair includes one or more second images, and wherein the
first
and second images include visual information that is analyzed during the
determining of the direction for the at least one pair of recording locations;
creating a first panorama image for the starting recording location for
the at least one pair from the one or more first images, and creating a second

panorama image for the ending recording location for the at least one pair
from the
one or more second images;
generating, for each of the one or more pairs of recording locations
and using the determined direction for the pair, information to link the data
group
recorded at the starting recording location for the pair to the data group
recorded for
the ending recording location for the pair, including generating a link that
points
between the starting recording location for the pair and the ending recording
location
for the pair; and
providing information for presentation about the building that includes
at least some of one or more data groups recorded at one or more of the
recording
locations and that includes at least one of the generated links, including:
initiating presenting, on a client device of an end user, the first
panorama image for the starting recording location of the at least one pair,
and a
visual representation of the generated link between the starting and ending
recording locations for the at least one pair; and
initiating presenting, on the client device and after selection of the
visual representation of the generated link by a user, additional visual
information
that includes the second panorama image for the ending recording location of
the
at least one pair.
108. The system of claim 107 wherein the obtaining of the multiple data
groups for the multiple recording locations includes, for each of the one or
more
recording locations, capturing audio data at the recording location, and
wherein the
providing of the information further includes initiating presenting, on a
client device
of an end user, the audio data captured at one of the one or more recording
location,
and a representation of the at least one generated link associated with the
presented
audio data.
109. The system of claim 107 wherein the one or more recording locations
are external to the building, and wherein the obtaining of the multiple data
groups
for the multiple recording locations further includes recording data at each
of the
one or more recording locations corresponding to at least some of an exterior
of the
building.
110. The system of claim 109 wherein an interior recording location of the
multiple recording locations is inside the building and has a first image in
its data
group, wherein the external recording location has a second image in its data
group,
and wherein the external recording location and the interior recording
location are
one of the one or more pairs of recording locations, such that the determining
of the
direction from the starting recording location of that pair to the ending
recording
location of the pair is based at least in part on one or more matching
features being
visible in both the first and second images.
111. A computer-implemented method comprising:
recording data with one or more mobile devices for a sequence of viewing
locations within an interior of a building, including recording visual data of
the interior
from each of the viewing locations, and including capturing linking
information about
movement of the one or more mobile devices along a travel path as the one or
more
mobile devices move between successive viewing locations in the sequence;
creating, by one or more computing devices and from the recorded visual
data for the viewing locations, panorama images that include a panorama image
for
each viewing location with views from the viewing location for each of
multiple
directions;
generating, by the one or more computing devices and based at least in part
on the captured linking information, relative positional information between
at least
some of the viewing locations; and
providing, by the one or more computing devices, information for display
about the interior of the building that includes the created panorama images
and
that indicates at least some of the generated relative positional information
for the
at least some viewing locations, including:
receiving, after display on a client device of one of the created
panorama images that includes a user-selectable visual representation of a
link to
another of the created panorama images, an indication of a selection by a user
of
the client device of the user-selectable visual representation; and
causing, based at least in part on the selection, the client device to
display the another created panorama image.
112. The computer-implemented method of claim 111 wherein the
capturing of the linking information as the one or more mobile devices move
between successive viewing locations in the sequence includes recording visual

data along the travel path, and wherein the generating of the relative
positional
information between the at least some viewing locations includes analyzing the

recorded visual data.
113. The computer-implemented method of claim 112 wherein the
recording of the visual data along the travel path as the one or more mobile
devices
move between the successive viewing locations in the sequence includes
monitoring motion of the one or more mobile devices, and providing one or more

guidance cues to an associated user based on the monitoring.
114. The computer-implemented method of claim 111 wherein the
capturing of the linking information as the one or more mobile devices move
between successive viewing locations in the sequence includes recording
information about lighting conditions as the one or more mobile devices move
along
a travel path between the successive viewing locations in the sequence, and
wherein the providing of the information includes providing the recorded
information
about the lighting conditions.
115. The computer-implemented method of claim 111 wherein the
capturing of the linking information as the one or more mobile devices move
between successive viewing locations in the sequence includes recording
information about environmental conditions as the one or more mobile devices
move
along a travel path between the successive viewing locations in the sequence,
and
wherein the providing of the information includes providing the recorded
information
about the environmental conditions.
116. The computer-implemented method of claim 111 wherein the
capturing of the linking information as the one or more mobile devices move
between successive viewing locations in the sequence includes recording
descriptive information from an associated user as the one or more mobile
devices
move along a travel path between the successive viewing locations in the
sequence,
and wherein the providing of the information includes providing the recorded
descriptive information.
117. A non-transitory computer-readable medium with stored contents that
cause one or more computing devices to perform automated operations including
at least:
recording, using one or more mobile devices, data for a sequence of viewing
locations associated with multiple buildings, including recording visual data
from
each of the viewing locations and recording additional data other than visible
light
from an environment at each of the viewing locations, and including capturing
linking
information about movement of the one or more mobile devices along a travel
path
as the one or more mobile devices move between successive viewing locations in

the sequence, wherein one of the one or more mobile devices is a self-powered
mobile device that moves itself between successive viewing locations in the
sequence, wherein at least some of the linking information is captured by the
self-
powered mobile device as it moves itself between successive viewing locations
in
the sequence, and wherein at least one of the successive viewing locations in
the
sequence is selected based on providing to an associated user a notification
regarding use of the at least one viewing location after an automated
determination
to provide the notification;
creating, by the one or more computing devices and for each of the viewing
locations, a panorama image for the viewing location from the recorded visual
data
for the viewing location and a representation of the viewing location using
the
recorded additional data for the viewing location;
generating, by the one or more computing devices and based at least in part
on the captured linking information, relative positional information between
at least
some of the viewing locations; and
presenting, by the one or more computing devices, information about at least
one of the multiple buildings that includes at least some of the created
panorama
images and that indicates at least some of the generated relative positional
information for the at least some viewing locations and that includes at least
one of
the created representations using recorded additional data.
118. The non-transitory computer-readable medium of claim 117 wherein
the stored contents include software instructions that, when executed, cause
the
one or more computing devices to perform the automated operations, and wherein

the presenting of the information includes receiving, after display on a
client device
of the created panorama image for one of the viewing locations that includes a
user-
selectable visual representation of a link to the created panorama image of
another
of the viewing locations, an indication of a selection by a user of the client
device of
the user-selectable visual representation, and causing, based at least in part
on the
selection, the client device to display the created panorama image of the
another
viewing location and to present the created representation for the another
viewing
location.
119. A computer-implemented method comprising:
recording data with a mobile device for a sequence of recording locations
associated with a building, including recording, at each of the recording
locations,
at least one type of data other than visible light from an environment at that
recording
location, and capturing linking information associated with movement between
successive recording locations in the sequence;
creating, by one or more computing devices, a representation of each
recording location using the data recorded at the recording location;
generating, by the one or more computing devices and based at least in part
on the captured linking information, relative positional information between
at least
some of the recording locations; and
providing, by the one or more computing devices, information for
presentation about the building that includes at least some of the created
representations and that indicates at least some of the generated relative
positional
information for the at least some recording locations.
120. The computer-implemented method of claim 119 wherein the
recording of the at least one type of data other than visual light at each of
the at
least some recording locations includes recording audio data from that
recording
location and recording visual data from that recording location, and wherein
the
providing of the information about the building includes presenting, by the
one or
more computing devices and on a client device of an end user, the created
representation for one of the recording locations by displaying the visual
data from
the one recording location and by playing the audio data from the one
recording
location.
121. The computer-implemented method of claim 120 wherein the
recording of the at least one type of data other than visual light at each of
the at
least some recording locations includes recording, using one or more
microphones
at that recording location, data from a surrounding environment for that
recording
location.
122. The computer-implemented method of claim 120 wherein the
recording of the at least one type of data other than visual light at each of
the at
least some recording locations includes recording, from a user associated with
the
mobile device, a verbal description of one or more aspects of that recording
location.
123. The computer-implemented method of claim 119 wherein the
recording of the at least one type of data other than visual light at each of
the at
least some recording locations includes recording at least one of infrared
data or
ultraviolet data.
124. The computer-implemented method of claim 119 wherein the
recording of the at least one type of data other than visual light at each of
the at
least some recording locations includes recording at least one of radiation
levels or
electromagnetic field levels.
125. The computer-implemented method of claim 119 wherein the
recording of the at least one type of data other than visual light at each of
the at
least some recording locations includes recording information about lighting
conditions at that recording location.
126. The computer-implemented method of claim 119 wherein the
recording of the at least one type of data other than visual light at each of
the at
least some recording locations includes recording data from an altimeter
sensor at
that recording location.
127. The computer-implemented method of claim 119 wherein the
providing of the information includes receiving, after presentation on a
client device
of one of the created representations that includes a user-selectable link to
another
of the created representations, an indication of a selection by a user of the
client
device of the user-selectable link, and causing, based at least in part on the

selection, the client device to present the another created representation.
128. A system comprising:
one or more hardware processors of one or more computing devices; and
one or more memories with stored instructions that, when executed by at
least one of the one or more hardware processors, cause the one or more
computing devices to create and present information representing a building,
including:
obtaining data recorded with a mobile device for a sequence of
recording locations associated with a building, wherein the obtained data
includes,
for each of the recording locations, at least one type of data other than
visible light
that is recorded from an environment at that recording location, and further
includes
linking information that is captured during movement between successive
recording
locations in the sequence;
creating a representation of each recording location using the data
recorded from the environment at the recording location;
generating, based at least in part on the linking information, relative
positional information between at least some of the recording locations; and
providing information for presentation about the building that includes
at least some of the created representations and that indicates at least some
of the
generated relative positional information for the at least some recording
locations,
including:
receiving, after presentation on a client device of one of the
created representations that includes a user-selectable link to another of the
created
representations, an indication of a selection by a user of the client device
of the
user-selectable link; and
causing, based at least in part on the selection, the client device
to present the another created representation.
129. A computer-implemented method comprising:
recording data for a sequence of viewing locations within an interior of a
building, wherein the recording is performed at least in part by a self-
powered mobile
device that moves itself between successive viewing locations in the sequence,
and
wherein the recording includes recording visual data of the interior from each
of the
viewing locations and includes capturing linking information as the self-
powered
device moves itself between the successive viewing locations in the sequence;
creating, by one or more computing devices and from the recorded visual
data for the viewing locations, panorama images that include a panorama image
for
each viewing location with views from the viewing location for each of
multiple
directions;
generating, by the one or more computing devices and based at least in part
on the captured linking information, relative positional information between
at least
some of the viewing locations; and
providing, by the one or more computing devices, information for display
about the interior of the building that includes the created panorama images
and
that indicates at least some of the generated relative positional information
for the
at least some viewing locations.
130. The computer-implemented method of claim 129 wherein the self-
powered device is an aerial drone device, and wherein the self-powered mobile
device moving itself between the successive viewing locations in the sequence
includes flying between the successive viewing locations in the sequence.
131. The computer-implemented method of claim 129 wherein the self-
powered device is at least one of a ground-based drone or robot device, and
wherein the self-powered mobile device moving itself between the successive
viewing locations in the sequence includes moving along a floor or other ground
ground
surface between the successive viewing locations in the sequence.
132. The computer-implemented method of claim 129 wherein the self-
powered device operates under remote control of a user or operates semi-
autonomously or operates fully autonomously, and wherein the self-powered
mobile
device moving itself between the successive viewing locations in the sequence
is
performed in response to instructions from the remote control or from semi-
autonomous operation or from fully autonomous operation.
133. A computer-implemented method comprising:
recording data with one or more mobile devices for a sequence of viewing
locations within an interior of a building, including:
automatically determining, for each of one or more of the viewing
locations, to provide a notification regarding use of the viewing location for
the
sequence, and providing, to a user associated with the one or more mobile
devices
and in response to the determining, the notification for the viewing location
on at
least one of the one or more mobile devices;
recording visual data of the interior from each of the viewing locations;
and
capturing linking information as the one or more mobile devices move
between successive viewing locations in the sequence;
creating, by one or more computing devices and from the recorded visual
data for the viewing locations, panorama images that include a panorama image
for
each viewing location with views from the viewing location for each of
multiple
directions;
generating, by the one or more computing devices and based at least in part
on the captured linking information, relative positional information between
at least
some of the viewing locations; and
providing, by the one or more computing devices, information for display
about the interior of the building that includes the created panorama images
and
that indicates at least some of the generated relative positional information
for the
at least some viewing locations.
134. The computer-implemented method of claim 133 wherein the
providing of the notification for a viewing location to the user on the at
least one
mobile device includes identifying the viewing location to the user based at
least in
part on a distance that the user has moved since a last viewing location.
135. The computer-implemented method of claim 133 wherein the
providing of the notification for a viewing location to the user on the at
least one
mobile device includes notifying the user of the viewing location based at
least in
part on one or more directions that the user has moved since a last viewing
location.
136. The computer-implemented method of claim 133 wherein the
automatic determining to provide a notification regarding use of a viewing
location
includes determining advisability of using the viewing location for the
sequence, and
wherein the providing of the notification for a viewing location to the user
on the at
least one mobile device includes notifying the user of the determined
advisability for
the viewing location.
137. The computer-implemented method of claim 133 wherein the
recording, for one of the viewing locations of the sequence, of the visual
data of the
interior for the one viewing location includes automatically determining, by
the one
or more mobile devices, that the one or more mobile devices have reached the
one
viewing location, and in response to the automatic determining that the one or
more
mobile devices have reached the one viewing location, automatically initiating
the
recording of the visual data for the one viewing location.
138. A computer-implemented method comprising:
recording data with one or more devices for a plurality of viewing locations
associated with multiple buildings co-located on a property, including
recording
visual data from each of the viewing locations including an interior of at
least one of
the multiple buildings, and including capturing linking information associated
with
movement between at least some of the plurality of viewing locations;
creating, by one or more computing devices and from the recorded visual
data for the viewing locations, panorama images that include a panorama image
for
each viewing location with views from the viewing location for each of
multiple
directions;
generating, by the one or more computing devices and based at least in part
on the captured linking information, relative positional information between
at least
some of the viewing locations; and
providing, by the one or more computing devices, information for display
about at least some of the multiple buildings that includes the created
panorama
images and that indicates at least some of the generated relative positional
information for the at least some viewing locations, and that includes
information
about the multiple buildings together in a linked manner.
139. The computer-implemented method of claim 138 wherein the multiple
buildings are part of one or more city blocks, and wherein the providing of
the
information includes providing information about at least one city block of
the one or
more city blocks.
140. The computer-implemented method of claim 138 wherein the multiple
buildings are located along one or more streets and/or roads, and wherein the
providing of the information includes providing information about at least one
street
and/or road of the one or more streets and/or roads.
Description

Note: Descriptions are shown in the official language in which they were submitted.


CAPTURING, CONNECTING AND USING
BUILDING INTERIOR DATA FROM MOBILE DEVICES
TECHNICAL FIELD
[0001] The
following disclosure relates generally to techniques for
acquiring, analyzing and using information from an interior of a building in
order to generate and provide a representation of that interior, such as to
capture and analyze visual images and other sensor data from a mobile
device at multiple viewing locations in a house to generate and present
inter-connected panorama images of various locations within and
surrounding the house.
BACKGROUND
[0002] In various
fields and circumstances, such as real estate acquisition
and development, property inspection, architectural analysis, general
contracting, improvement cost estimation and other circumstances, it may
be desirable to view the interior of a house, office, or other building
without
having to physically travel to and enter the building. While traditional still

photographs of a building's interior may provide some understanding of that
interior, it is difficult to fully understand the layout and other details of
the
interior from such photographs. However, it
can also be difficult or
impossible to accurately and efficiently capture more immersive types of
visual information for building interiors, without spending significant time
and
using specialized equipment.
BRIEF DESCRIPTION OF THE DRAWINGS
[0003] Figures 1A-
1B are diagrams depicting an exemplary building interior
environment and computing system(s) for use in embodiments of the
present disclosure.
[0004] Figures 2A-2I illustrate examples of analyzing and using
information
acquired from an interior of a building in order to generate and provide a
representation of that interior.

[0005] Figure 3 is
a block diagram illustrating a computing system suitable
for executing an embodiment of a system that performs at least some of the
techniques described in the present disclosure.
[0006] Figure 4 depicts a process flow for a Building Interior Capture
and
Analysis (BICA) system routine in accordance with an embodiment of the
present disclosure.
[0007] Figure 5 depicts a process flow for a building interior data
acquisition
routine in accordance with an embodiment of the present disclosure.
[0008] Figures 6A-B depict a process flow for a panorama connection
routine in accordance with an embodiment of the present disclosure.
[0009] Figure 7 depicts a process flow for a building interior
representation
presentation routine in accordance with an embodiment of the present
disclosure.
DETAILED DESCRIPTION
[0010] The present
disclosure relates generally to techniques for one or
more devices to perform automated operations involved in acquiring and
analyzing information from an interior of a house, building or other
structure,
for use in generating and providing a representation of that interior. For
example, in at least some such embodiments, such techniques may include
using one or more mobile devices (e.g., a smart phone held by a user, a
camera held by or mounted on a user or the user's clothing, etc.) to capture
video data from a sequence of multiple viewing locations (e.g., video
captured at each viewing location while a mobile device is rotated for some
or all of a full 360 degree rotation at that viewing location) within multiple

rooms of a house (or other building), and to further capture data linking the
multiple viewing locations. The capturing of the data linking two successive
viewing locations in the sequence may include, for example, capturing
movement data (e.g., acceleration and other data from an IMU, or inertial
measurement unit, of a mobile device) as a user with the mobile device
walks or otherwise moves between the two viewing locations, as well as
optionally recording video or other visual data for at least some of the user
movement. After the viewing location videos and linking information are
captured, the techniques may include analyzing video captured at each
viewing location to create a panorama image from that viewing location that
has visual data in multiple directions (e.g., a 360 degree panorama around a
vertical axis), analyzing the linking information to determine relative
positions/directions between each of two or more viewing locations, creating
inter-panorama positional/directional links in the panoramas to each of one
or more other panoramas based on such determined positions/directions,
and then providing information to display or otherwise present multiple
linked panorama images for the various viewing locations within the house.
Some or all of the techniques described herein may be performed via
automated operations of an embodiment of a Building Interior Capture and
Analysis ("BICA") system, as discussed in greater detail below.
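A minimal orchestration sketch of this overall flow is given below; the helper names (create_panorama, estimate_direction) and the data layout are assumptions standing in for the per-step analyses summarized above, not the BICA system's actual implementation:

    # Illustrative sketch only: linking per-location panoramas with directional links.
    from dataclasses import dataclass, field

    @dataclass
    class ViewingLocation:
        location_id: str
        panorama: object                             # e.g., an equirectangular image
        links: dict = field(default_factory=dict)    # neighbor id -> direction (deg)

    def build_linked_panoramas(captures, linking_segments,
                               create_panorama, estimate_direction):
        # captures: ordered list of (location_id, recorded_video);
        # linking_segments: per successive pair, the captured linking data.
        locations = [ViewingLocation(loc_id, create_panorama(video))
                     for loc_id, video in captures]
        for (start, end), segment in zip(zip(locations, locations[1:]),
                                         linking_segments):
            direction = estimate_direction(start.panorama, end.panorama, segment)
            start.links[end.location_id] = direction
            end.links[start.location_id] = (direction + 180.0) % 360.0
        return locations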
[0011] Thus, in at least some embodiments, one or more processor-based
computing systems are used to capture and generate information regarding
a building environment (e.g., interior, exterior and/or surroundings) based on

recorded video information and/or sensor data captured by a mobile device
at each of multiple viewing locations within the building interior, as well as

based on sensor data (and possibly additional recorded video information)
captured during movement of the mobile device between such arbitrary
viewing locations. As used herein, the term "building" refers to any partially

or fully enclosed structure, typically but not necessarily encompassing one
or more rooms that visually or otherwise divide the interior space of the
structure - non-limiting examples of such buildings include houses,
apartment buildings or individual apartments therein, condominiums, office
buildings, commercial buildings or other wholesale and retail structures
(e.g., shopping malls and department stores), etc. The term "acquire" or
"capture" as used herein with reference to a building interior, viewing
location, or other location (unless context clearly indicates otherwise) may
refer to any recording, storage, or logging of media, sensor data, and/or
other information related to spatial and/or visual characteristics of the
building interior or subsets thereof, such as by a recording device or by
another device that receives information from the recording device. As used
herein, the term "panorama image" refers to any visual representation that is
based on, includes or is separable into multiple discrete component images
originating from a substantially similar physical location in different
directions and that depicts a larger field of view than any of the discrete
component images depict individually, including images with a sufficiently
wide-angle view from a physical location to include angles beyond that
perceivable from a person's gaze in a single direction. The term "sequence"
of viewing locations, as used herein, refers generally to two or more viewing
locations that are each visited at least once in a corresponding order,
whether or not other non-viewing locations are visited between them, and
whether or not the visits to the viewing locations occur during a single
continuous period of time or at multiple different time periods.
[0012] For illustrative purposes, some embodiments are described below
in
which specific types of information are acquired and used in specific types
of ways for specific types of structures and by using specific types of
devices. However, it will be understood that such described techniques may
be used in other manners in other embodiments, and that the invention is
thus not limited to the exemplary details provided. As one non-exclusive
example, various of the embodiments discussed herein include a mobile
device being carried by a user while the mobile device captures various
types of data, but in other embodiments one or more such mobile devices
may move within some or all of a building interior in other manners, such as
if carried by or integrated in an aerial or ground-based drone, robot or other

autonomous, semi-autonomous and/or remotely controlled device with
motion capabilities. As another
non-exclusive example, while some
illustrated embodiments include the linked panorama images representing
or covering a single house or other structure, in other embodiments the
linked panoramas may extend beyond a single such house or other
structure, such as to include links to and panorama images of (or other
visual representations of) an exterior environment associated with the
structure (e.g., yard; pool; separate garages, sheds, barns, pool houses,
boat houses, guest quarters or other outbuildings; etc.), of one or more
other nearby houses or other structures (e.g., on a same city block), of
nearby streets, roads and/or other areas, etc., as well as to include
apartment buildings, office buildings, condominiums and other multi-tenant
buildings or structures. As yet another non-exclusive example, while some
illustrated embodiments include linking and presenting multiple panorama
images, other embodiments may include linking and/or presenting other
types of information (whether in addition to or instead of such panorama
images), such as videos or other visual information from each of multiple
viewing locations that are in forms other than panorama images, information
based on infrared and/or ultraviolet and/or other non-visible light or energy
(e.g., radiation levels; electromagnetic field, or EMF, levels; etc.), audio
information from the environment surrounding a viewing location and/or
from other sources (e.g., a recording user's annotations or other verbal
descriptions), etc. - for example, short recordings of a noise level may be
captured at one or more recording locations within a building, such as under
different conditions (e.g., with windows open, with windows shut, etc.), at
different times, etc. As yet another non-exclusive example, while some
illustrated embodiments include presenting linked panoramas or other generated
representations of a building interior (and/or other captured targets) on a
display of a client device to an end user, visual and/or audio and/or other
information (e.g., haptic information) may be presented or otherwise
provided to end users in other manners, such as part of an augmented
reality ("AR") system (e.g., via specialized glasses or other head-mounted
display) and/or a virtual reality ("VR") system (e.g., via specialized
headgear and/or other output devices). In addition, various details are
provided in the drawings and text for exemplary purposes, but are not
intended to limit the scope of the invention. For example, sizes and relative
positions of elements in the drawings are not necessarily drawn to scale,
with some details omitted and/or provided with greater prominence (e.g., via
size and positioning) to enhance legibility and/or clarity.
Furthermore,
identical reference numbers may be used in the drawings to identify similar
elements or acts.
[0013] Figure 1A depicts a block diagram of an exemplary building
interior
environment and mobile computing system in accordance with an
embodiment of the present disclosure. In particular, Figure 1A includes a
building 199 with an interior to be captured by a user mobile computing
system 105 (also referred to as a "mobile device" for this example) as it is
moved through the building interior to a sequence of multiple viewing

locations 110 by a user (not shown) via a travel path 115. A Building
Interior Capture & Analysis ("BICA") system may automatically perform or
assist in the capturing of the data representing the building interior, as
well
as further analyze the captured data to generate a visual representation of
the building interior, as discussed further herein - in this example, the BICA

system is executing as a local application 155 on the mobile device 105,
although in other embodiments it may execute in part or in whole on one or
more other computing systems (not shown) that are remote from the
building 199, as discussed in greater detail with respect to Figure 1B.
[0014] In the depicted embodiment, the mobile device 105 includes one
or
more hardware processors 130; one or more imaging systems 135, which
include photographic and video recording capabilities; a display system 140,
which includes a main display screen having a plurality of graphical display
elements, and may further include other components of the mobile device
(such as one or more light-emitting elements aside from the main display
screen); a control system 145, such as to include an operating system,
graphical user interface ("GUI"), etc.; and one or more sensor modules 148,
which in the depicted embodiment include a gyroscope module 148a, an
accelerometer module 148b, and a compass module 148c (e.g., as part of
one or more IMU units of the mobile device). In other embodiments, the
sensor modules 148 may include additional sensors, such as an altimeter
module, light detection module, one or more microphones, etc., and other
output modules (e.g., one or more speakers or audio output ports) may be
provided. In at least some embodiments, the display system 140 may
include a touchscreen component of the control system 145, such that at
least some operations of the mobile device may be controlled by physical
user interaction with elements of a graphical user interface presented via
the display system. The mobile device as depicted further includes a
memory 150, which in the illustrated embodiment is executing the BICA
application 155, and may optionally also be executing a browser application
160, although in other embodiments the device that captures the video
and/or other sensor data for the building interior may transfer the captured
data to one or more other devices (not shown) executing a copy of the BICA
application for analysis. In one or more embodiments, additional
components or applications may also be executing within the memory 150
of the mobile device.
[0015] In operation, a user associated with the mobile device 105
enters
the building interior 199 via travel path 114, arriving with the mobile device

at a first viewing location 110A within a first room of the building interior.
In
response to one or more interactions of the user with the control system 145
of the mobile device, the BICA application initiates recording a first video
of
the building interior, capturing a view of the building interior from first
viewing location 110A (e.g., some or all of the first room, and optionally
small portions of one or more other adjacent or nearby rooms, such as
through doors, halls or other connections from the first room) as the mobile
device is rotated around a vertical axis at the first viewing location (e.g.,
with
the user turning his or her body in a circle while holding the mobile device
stationary relative to the user's body). In addition to recording video, the
BICA application may monitor, and/or initiate concurrent recording of,
various data provided by the sensor modules 148. For example, the BICA
application may monitor a rotational speed of the mobile device via data
provided by the gyroscopic module and/or accelerometer module; may
associate with the recorded video a heading reported by the compass
module at the time the video recording is initiated; etc. In certain
embodiments, the BICA application may analyze one or more video frames
captured during the recording process to determine and/or automatically
correct issues regarding the recorded video, such as to correct or
compensate for an undesirable level of exposure, focus, motion blur, or
other issue. Furthermore, in certain scenarios and embodiments, a viewing
location may be captured in other manners, including to capture multiple still

photographs from different perspectives and angles at the viewing location
rather than recording video data at the viewing location.
[0016] In certain embodiments, the BICA application may provide real-
time
feedback to the user of the mobile device via one or more guidance cues
during the recording of the first video of the building interior, such as to
provide guidance for improving or optimizing movement of the mobile device
during the recording process. For example, the BICA application may
determine (such as based on sensor data provided by sensor modules 148)
that the mobile device is rotating too quickly to record high quality video
from the first viewing location, and if so may provide an auditory, visual, or

other appropriate notification to indicate that the user should rotate the
mobile device more slowly during the recording process. As another
example, the BICA application may determine that the mobile device is
shaking or otherwise failing to provide high quality video (such as based on
sensor data or one or more analyses of particular captured video frames),
and if so may provide a notification to advise the user of the problem. As
still another example, in certain embodiments the BICA application may
provide a notification to the user if it is determined that a particular
viewing
location is unsuitable for capturing information about the building interior,
such as if the BICA application detects that lighting conditions or other
environmental factors for the present viewing location are negatively
affecting the recording process. In certain scenarios and embodiments, the
BICA application may re-initiate the recording process once one or more
conditions interfering with high-quality recording have been alleviated.
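As a non-limiting illustration of such guidance cues, the following Python sketch checks one sensor sample against simple thresholds and returns any notifications to present to the user; the threshold values and the function name are illustrative assumptions rather than parameters of any described embodiment.

    import math

    # Thresholds are illustrative assumptions, not values from the disclosure.
    MAX_ROTATION_RATE_DEG_S = 30.0   # warn if the user spins faster than this
    MAX_SHAKE_ACCEL_M_S2 = 2.0       # warn on strong non-rotational jitter

    def guidance_cues(gyro_rate_dps, linear_accel):
        """Return user-facing notifications for one sensor sample.

        gyro_rate_dps: angular speed about the vertical axis, degrees/second.
        linear_accel: (x, y, z) acceleration with gravity removed, m/s^2.
        """
        cues = []
        if abs(gyro_rate_dps) > MAX_ROTATION_RATE_DEG_S:
            cues.append("Please rotate the device more slowly.")
        if math.sqrt(sum(a * a for a in linear_accel)) > MAX_SHAKE_ACCEL_M_S2:
            cues.append("The device appears to be shaking; hold it steady.")
        return cues

    # Example: spinning at 45 deg/s with little shake triggers the first cue.
    print(guidance_cues(45.0, (0.1, 0.0, 0.2)))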
[0017] Furthermore, in certain embodiments the BICA application may
prompt a user for information regarding one or more of the viewing locations
being captured, such as to provide a textual or auditory identifier to be
associated with a viewing location (e.g., "Living Room," "Office," "Bedroom
1" or other identifier), or to otherwise capture descriptive information from
the user about the room (e.g., a description of built-in features, a history
of
remodels, information about particular attributes of the interior space being
recorded, etc.). In other
embodiments, such identifiers and/or other
descriptive information may be determined in other manners, including
automatically analyzing video and/or other recorded information for a
building (e.g., using machine learning) for the determination. In at least one

embodiment, such acquired or otherwise determined identifiers and/or other
descriptive information may be later incorporated in or otherwise utilized
with the captured information for a viewing location, such as to provide a
textual or auditory indication of the identifier or other descriptive
information
during subsequent display or other presentation of the building interior by
the BICA application or system (or by another system that receives
corresponding information from the BICA application).
[0018] In one or
more embodiments, the BICA application may further
determine to modify one or more parameters of the imaging system 135 as
part of improving quality of or otherwise improving some or all video
recorded during capture of a building interior. For example, in certain
scenarios the BICA application may automatically determine to use one or
more of various exposure, aperture, and focus parameters; and may
automatically adjust one or more parameters based on a type of lens or
lenses used by the imaging system, such as if the imaging system includes
multiple lenses of different focal lengths or to compensate for an atypical
lens type (e.g., "fisheye," wide-angle, or telephoto lenses), and/or may use
an external camera (e.g., a 360° camera that acquires data in at least
360° in a single frame or otherwise simultaneously). The BICA application
may also optionally initiate presentation of user feedback (e.g., display of
one or more GUI elements to the user; use of audio and/or tactile feedback,
whether instead of or in addition to visual information, etc.) to suggest
parameters of the imaging system for modification by the user in order to
improve video recording quality in a particular embodiment or situation (e.g.,

if the BICA application is unable to automatically modify such parameters).
In addition, in some embodiments, the capture of some or all of the video at
one or more viewing locations may use additional equipment to assist in the
capture, such as one or more of a tripod, additional lighting, a 3D laser
scanner and rangefinder (e.g., using LIDAR) or other depth finder, one or
more additional and/or external lenses, an external camera (e.g., a 360°
camera), an infrared emitter and/or detector, an ultraviolet emitter and/or
detector, one or more external microphones, etc.
[0019] In various circumstances and embodiments, the BICA application
may determine that multiple rotations of the mobile device at a viewing
location are desirable to adequately capture information there. As non-
limiting examples, the BICA application may determine to record video
having a greater dynamic range, such as by initiating multiple rotations of
the mobile device at different exposure values; or to capture a greater
vertical arc of the building interior, such as by initiating multiple
rotations of
the mobile device with distinct z-angles (e.g., one rotation in a lateral
direction that is approximately perpendicular to the vertical axis; another
rotation in which the vertical angle of the device is raised above that
perpendicular direction, such as to include at least some of the ceiling;
another rotation in which the vertical angle of the device is lowered below
that perpendicular direction, such as to include at least some of the floor;
etc.). In such circumstances, the BICA application may provide one or more
notifications or instructions to the user of the mobile device in order to
indicate the desirability of such multiple rotations.
[0020] In at least some embodiments, at a time after initiating the
recording
of the first video of the building interior in the first room, the BICA
application
may automatically determine that the first viewing location 110A has been
adequately captured, such as by determining that a full rotation of the
mobile device has been completed, or that sufficient data is otherwise
acquired. For example, the BICA application may determine that the
reported heading of the mobile device has returned to or passed a heading
associated with the beginning of the video recording, that the mobile device
has rotated a full 360° since video recording was initiated, that the user has
stopped rotation for a defined period of time (e.g., a small number of
seconds, such as after being prompted by the BICA application to stop the
rotation for that amount of time when the rotation is complete), etc. In at
least some embodiments, the BICA application may provide one or more
guidance cues to the user of the mobile device to indicate that a capture of
the building interior from the first viewing location 110A is completed and
that the user may proceed to additional viewing locations within the building
interior. It will be appreciated
that in certain scenarios, capture of a
particular viewing location may not require a full 360° rotation of the mobile

device in order to be adequately completed. For example, viewing locations
in close proximity to walls or corners may be adequately represented by
only a partial such rotation of the mobile device. Furthermore, in certain
scenarios and embodiments, a BICA application or system may create a
panorama image for a particular viewing location without the mobile device
105 completing a full rotation while recording video from that viewing
location. In such scenarios, the
BICA application or system may
compensate for the partial rotation in various manners, including but not
limited to: limiting a number of component images to include in the

panorama image if a disparate quantity of video information is recorded
from the viewing location for other portions of the building interior;
generating one or more interpolated component images that do not wholly
correspond to a single video frame recorded from the viewing location; or
in another manner, with the resulting panorama image optionally being less
than 360 degrees.
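The completion check described above can be illustrated with the following Python sketch, which accumulates compass-heading changes and also treats a brief pause near the end of a mostly complete turn as a completion signal; the threshold values and function names are illustrative assumptions, not values from the disclosure.

    def unwrap_total_rotation(headings_deg):
        """Total signed rotation in degrees, unwrapping across the 0/360 boundary."""
        total = 0.0
        for prev, curr in zip(headings_deg, headings_deg[1:]):
            total += (curr - prev + 180.0) % 360.0 - 180.0
        return total

    def rotation_complete(headings_deg, timestamps,
                          pause_window_s=2.0, pause_rate_dps=2.0):
        """Heuristic completion test for a viewing-location capture.

        Returns True once the device has swept a full 360 degrees, or has swept
        most of a turn and then held nearly still for a short trailing pause.
        """
        swept = abs(unwrap_total_rotation(headings_deg))
        if swept >= 360.0:
            return True
        if swept >= 300.0 and len(timestamps) >= 2:
            t_end = timestamps[-1]
            recent = [(h, t) for h, t in zip(headings_deg, timestamps)
                      if t_end - t <= pause_window_s]
            if len(recent) >= 2:
                dh = abs(unwrap_total_rotation([h for h, _ in recent]))
                dt = recent[-1][1] - recent[0][1]
                if dt > 0 and dh / dt <= pause_rate_dps:
                    return True
        return False

    # Example: headings sweeping slightly past a full turn.
    headings = [(i * 10) % 360 for i in range(38)]
    times = [i * 0.5 for i in range(38)]
    print(rotation_complete(headings, times))   # -> True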
[0021] Continuing the example of Figure 1A, once the first viewing location

110A has been captured in the first room, the mobile device 105 moves
along travel path 115 as the user carries it to a next viewing location 110B,
which in this example is in a different second room through 2 doors and an
intervening hallway. As the mobile device is moved between viewing
locations, the BICA application captures linking information that includes
acceleration data associated with the movement of the mobile device, such
as that received from accelerometer module 148b, and in certain
embodiments may capture additional information received from other of the
sensor modules 148, including to capture video or other visual information
along at least some of the travel path in some embodiments and situations.
In various embodiments, and depending upon specific configuration
parameters of sensor modules 148, disparate quantities of acceleration data
may be collected corresponding to movement of the mobile device along
travel path 115. For example, in certain scenarios acceleration data and
other sensor data may be received from the sensor modules at regular
periodic intervals (e.g., 1000 data points a second, 100 data points a
second, 10 data points a second, 1 data point a second, etc.), while other
scenarios and/or sensor modules may result in such sensor data being
received irregularly, or at varying periodic intervals. In this manner, the
BICA application may receive greater or lesser quantities of acceleration
data during travel of the mobile device between viewing locations depending
on the capabilities and configuration of the particular sensor modules 148
included within the particular mobile device 105.
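The following Python sketch illustrates one hypothetical way to accumulate such linking data while the mobile device moves between viewing locations: each accelerometer callback simply appends a timestamped sample, so a higher-rate sensor naturally yields more samples per travel segment. The class and method names are assumptions for illustration only.

    from dataclasses import dataclass, field
    from typing import List, Tuple

    @dataclass
    class LinkingRecord:
        """Sensor samples captured while moving between two viewing locations."""
        from_location: str
        to_location: str
        accel_samples: List[Tuple[float, float, float, float]] = field(default_factory=list)

        def on_accelerometer(self, timestamp_s, ax, ay, az):
            # Invoked by the platform's sensor callback; sensors that report
            # more frequently simply produce more rows here.
            self.accel_samples.append((timestamp_s, ax, ay, az))

    record = LinkingRecord("110A", "110B")
    for i in range(5):                       # e.g., a 10 Hz accelerometer stream
        record.on_accelerometer(i * 0.1, 0.0, 0.2, 9.8)
    print(len(record.accel_samples), "samples captured")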
[0022] In one or more embodiments, the BICA application may further
determine to terminate video recording for a viewing location in various
manners (such as based on automatic detection of movement away from
the viewing location, on one or more defined user preferences, on an explicit
user request, on a full rotation of the mobile device or period of non-
movement or other determination that the viewing location is adequately
captured, etc.). In other scenarios, the BICA application may continue video
recording without termination between capturing video of a viewing location
and subsequent movement of the mobile device along travel path 115 - in
such embodiments, the BICA application may associate with the captured
video (either at the time of recording or during later analysis of such
captured video, described elsewhere herein) one or more indications of
demarcation ("markers" or "separation points") corresponding to a detected
change between receiving sensor data indicative of rotation around a
vertical axis (typically associated with capturing of a viewing location) and
receiving sensor data indicative of lateral or vertical movement (typically
associated with movement between such viewing locations), optionally after
a defined period of substantially no movement. The BICA application may
further determine to maintain video recording until receiving an indication
that all capture of a building interior has been completed (such as
completion of video recording for a final viewing location within the building

interior). It will be appreciated that during the course of multiple segments
of movement through a building interior at and between multiple viewing
locations, the BICA application may determine to maintain and utilize
continuous video recording during all segments of such movement, one or
more individual/contiguous segments of such movement, or no segments of
such movement at all. In at least some embodiments, such determination
may be based on one or more of defined user preferences, configuration
parameters, available resources (such as storage capacity or other
resources) of the mobile device 105, a quantity or type(s) of sensor data
captured during such movement, or other factors.
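As a non-limiting sketch of how such markers might be placed, the following Python example labels each sensor sample as rotation in place or as lateral travel and records a separation point wherever the label changes; the thresholds and the simple labeling rule are illustrative assumptions rather than the specific detection logic of any embodiment.

    def place_markers(samples, rot_thresh_dps=20.0, accel_thresh_m_s2=1.0):
        """Insert separation points where the dominant motion type changes.

        samples: list of (timestamp, gyro_rate_dps, lateral_accel_m_s2) tuples,
        where gyro_rate_dps is rotation about the vertical axis and
        lateral_accel_m_s2 is acceleration with gravity removed.
        """
        def label(gyro, accel):
            if abs(gyro) >= rot_thresh_dps and accel < accel_thresh_m_s2:
                return "viewing_location"      # rotating roughly in place
            if accel >= accel_thresh_m_s2:
                return "travel"                # walking between locations
            return "idle"

        markers = []
        previous = None
        for t, gyro, accel in samples:
            current = label(gyro, accel)
            if previous is not None and current != previous and current != "idle":
                markers.append((t, previous, current))
            if current != "idle":
                previous = current
        return markers

    demo = [(0.0, 40, 0.1), (1.0, 45, 0.2), (2.0, 2, 1.5), (3.0, 1, 1.8), (4.0, 50, 0.1)]
    print(place_markers(demo))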
[0023] In addition, and in a manner similar to the guidance cues and
other
instructions provided during capture of viewing location 110A, the BICA
application may in certain embodiments provide guidance cues and other
instructions to a user during movement of the mobile device between
viewing locations. For
example, in certain embodiments the BICA
application may notify the user if such movement has exceeded a defined or
suggested distance from the previous viewing location, or if the user is
attempting to capture a next viewing location that is determined by the
BICA application to be too close to the previous viewing location, or if the
user is engaging in too much movement of a particular type (e.g., sideways
rather than forward). Furthermore, in a manner analogous to video
recording for a viewing location, the BICA application may determine to
terminate video recording for a travel path between viewing locations in
various manners (such as based on a period of non-movement at the end of
the travel path or other determination that the travel path is adequately
captured, on an explicit user request, on one or more defined user
preferences, etc.).
[0024] Continuing the illustrated example of Figure 1A, once the mobile
device has arrived at the next viewing location 110B, the BICA application
may determine to begin capture of the viewing location. If video is currently
being recorded, the BICA application may associate with the captured video
(either at the time of recording or during later analysis of such captured
video) one or more markers corresponding to recording of a new viewing
location (e.g., based on a determined period of non-movement after the
movement to the new viewing location is completed; on a detected change
in receiving sensor data indicative of lateral or vertical movement between
viewing locations and receiving sensor data indicative of rotation around a
vertical axis; etc.). If video is
not currently being recorded, the BICA
application may in certain embodiments automatically initiate such video
recording upon detection that the user has begun to rotate the mobile
device, in response to a user request to begin capturing the next viewing
location (such as via one or more interactions of the user with the BICA
application and/or imaging system 135 via the control system 145), or in
other manners as previously noted.
[0025] In a manner similar to that described with respect to viewing
location
110A, the BICA application captures viewing location 110B by recording
video during rotation of the mobile device around a vertical axis at viewing
location 110B, optionally modifying imaging system parameters and
providing guidance cues or other instructions to the user of the mobile
device in order to improve the recorded video associated with the viewing
location. Upon determination that the viewing location 110B has been
adequately captured (either automatically or in response to a user request
as described above with respect to the capture of viewing location 110A), in
certain embodiments the BICA application may receive a user request to
terminate or to continue capture of the building interior, such as via one or
more user interactions with a graphical user interface provided by the BICA
application or in some other manner (e.g., user interaction with elements of
control system 145). For example, in accordance with one or more
embodiments and/or defined user preferences, the BICA application may
determine to continue capture of the building interior unless a user request
indicating otherwise is received; in other embodiments or in accordance with
other defined user preferences, the BICA application may automatically
terminate capture of the building interior unless and until user interaction
is
received indicating that one or more additional viewing locations (and linking

information during movement to such additional viewing locations) is to be
captured.
[0026] In the depicted embodiment of Figure 1A, additional viewing
locations 110C-110L, as well as linking information gathered during
movement between such viewing locations, are captured by the BICA
application as the user moves the mobile device 105 through building 199
interior along travel paths 115. Upon conclusion of capturing recorded video
corresponding to rotation around a vertical axis located at a last viewing
location 110L in the sequence, the BICA application determines to terminate
(such as in response to a user request) the capture of the building interior.
While the sequence of viewing locations and associated travel path do not
include any overlap in this example (e.g., with one portion of the travel path

crossing another portion of the travel path; one viewing location being the
same as or overlapping with another viewing location, such as to have a
loop in which the last viewing location is the same as the first viewing
location or other viewing location; etc.), other embodiments and situations
may include one or more such overlaps. Similarly, while the sequence of
viewing locations are traveled in a continuous manner in this example, other
embodiments and situations may include a non-contiguous path - as one
non-limiting example, the user in Figure 1A could stop after travelling from
viewing locations 110A-110F, and complete the sequence by resuming at
viewing location 110L and returning back to viewing location 110F along the
intervening portion of the travel path 115 (resulting in a different order of
viewing locations for the sequence than the one shown in Figure 1A),
whether substantially immediately or after an intervening period of time has
passed.
[0027] In at least some embodiments, either immediately upon
terminating
the capture of building interior or at a later time, a panorama image is
generated for each of viewing locations 110A-110L based on one or more
analyses of the respective video recording corresponding to each such
viewing location. Various operations may be performed on individual frames
of such a video recording as part of generating a corresponding panorama
image. Non-
limiting examples of such operations include sharpening,
exposure modification, cropping, integration of multiple exposures (such as
if multiple rotations using distinct exposure parameters were used in order to

expand a dynamic range of the recorded video, or instead one or more
parameters are dynamically modified during a single rotation), deblurring
(such as to compensate for detected motion blur), and selective discarding
of particular video frames (such as based on a determination that such
frames are out of focus, over- or under-exposed, duplicative of other video
frames, or on other criteria). Once the individual frames of the video
recording have been selected and modified in accordance with the
operations described above, the resulting images are stored by the BICA
system as a single panorama image, such as to include multiple navigable
component images.
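As one hedged illustration of the frame-selection step, the following Python sketch (using numpy) discards frames whose approximate sharpness or mean brightness falls outside simple bounds; the metrics and thresholds are illustrative assumptions and stand in for the richer per-frame operations listed above.

    import numpy as np

    def select_frames(frames, min_sharpness=50.0, exposure_range=(40.0, 215.0)):
        """Discard frames that look blurry, over-exposed, or under-exposed.

        frames: iterable of 2-D grayscale numpy arrays (one per video frame).
        Sharpness is approximated by the variance of the image gradient and
        exposure by mean pixel intensity.
        """
        kept = []
        for frame in frames:
            gy, gx = np.gradient(frame.astype(np.float64))
            sharpness = (gx ** 2 + gy ** 2).var()
            mean_intensity = frame.mean()
            if (sharpness >= min_sharpness
                    and exposure_range[0] <= mean_intensity <= exposure_range[1]):
                kept.append(frame)
        return kept

    # Example with synthetic frames: a textured frame passes, a flat dark frame is dropped.
    rng = np.random.default_rng(0)
    frames = [rng.integers(0, 255, (120, 160)).astype(np.uint8),
              np.full((120, 160), 10, dtype=np.uint8)]
    print(len(select_frames(frames)), "of", len(frames), "frames kept")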
[0028] In addition to generating panorama images corresponding to each
of
the viewing locations within the building interior, analysis of the linking
information corresponding to each segment of travel path 115 is performed
in order to determine relative positional information between at least
successive pairs of viewing locations along that travel path. In particular,
acceleration data corresponding to each such segment is analyzed to
determine, for example, a relative location of viewing location 110B with
respect to previous viewing location 110A (and vice versa), with viewing
locations 110A and 110B being a first pair of successive viewing locations; a
relative location of viewing location 110C with respect to previous viewing

location 110B (and vice versa), with viewing locations 110B and 110C
being a second pair of successive viewing locations; and so on. In at least
some embodiments, additional sensor data may be considered during such
analysis. For example, for building interiors encompassing multiple floors or
other elevations, in addition to analyzing vertical acceleration data to
determine a relative vertical distance between viewing locations, the BICA
system may additionally make such determination based on available
altimeter data, gyroscopic data, etc. In addition, recorded video captured as
part of the linking information or as part of capturing a particular viewing
location may be analyzed as part of determining the relative positional
information. For example, in certain embodiments individual video frames
within separate segments of recorded video, corresponding to video
recorded from separate viewing locations, may be analyzed to determine
similarities between such video frames - for example, one or more video
frames recorded as part of capturing viewing location 110E may be
compared with one or more additional video frames recorded as part of
capturing viewing location 110F as part of determining relative positional
information regarding those viewing locations, as discussed in greater detail
with respect to Figure 2A. It will be appreciated that while analysis of the
linking information may only directly result in relative positional
information
between successive viewing locations along travel path 115 (e.g., between
viewing locations 110D and 110E, or viewing locations 110G and 110H), a
full analysis of such linking information may in certain embodiments
indirectly result in the BICA system determining relative positional
information between additional viewing locations as well (e.g., between
viewing locations 110I and 110G, or viewing locations 110B and 110L), as
discussed in greater detail with respect to Figure 2B.
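A minimal sketch of the acceleration-based analysis, under the simplifying assumption that the samples are already expressed in a fixed world frame with gravity removed, is the double integration shown below; a real pipeline would also fuse gyroscope, compass and/or altimeter data and correct for integration drift, as noted above. The function name and example values are illustrative only.

    import numpy as np

    def relative_displacement(timestamps, accelerations):
        """Estimate the displacement between two viewing locations by
        integrating acceleration twice (simple dead reckoning).

        timestamps: 1-D array of sample times in seconds.
        accelerations: (N, 3) array of world-frame accelerations, gravity removed.
        """
        t = np.asarray(timestamps, dtype=float)
        a = np.asarray(accelerations, dtype=float)
        dt = np.diff(t)[:, None]
        velocity = np.vstack([np.zeros(3), np.cumsum(a[:-1] * dt, axis=0)])
        position = np.vstack([np.zeros(3), np.cumsum(velocity[:-1] * dt, axis=0)])
        return position[-1]      # displacement of the end point relative to the start

    # Example: constant 0.5 m/s^2 forward acceleration for 2 seconds -> roughly 1 m.
    ts = np.linspace(0.0, 2.0, 21)
    acc = np.tile([0.5, 0.0, 0.0], (21, 1))
    print(relative_displacement(ts, acc))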
[0029] In one or more embodiments, generating a panorama image for a
viewing location may include determining one or more component images to
use as primary component images of the panorama image, such as to
initially display when the panorama image is first presented to a user.
Various criteria may be utilized by the BICA system when determining
primary component images for a generated panorama image, including as
non-limiting examples: a component image that includes a view of a quantity
of other viewing locations within the building interior, a component image
determined to be of higher quality than other component images within the
generated panorama image (such as based on a depth of field, exposure,
lighting quality, or other attribute); etc. - thus, selection of a primary
component image may be unrelated to the sequence of video frames
originally recorded from the viewing location corresponding to the generated
panorama image. In certain scenarios and embodiments, multiple primary
component images may be selected when generating a panorama image,
such as to reflect a respective direction from which a viewer might arrive at
the corresponding viewing location from other viewing locations within the
building interior. With reference to Figure 1A, for example, the BICA system
may determine to select a first primary component image of the panorama
image for viewing location 110E that corresponds to the perspective of a
viewer arriving from viewing locations 110A or 110D (i.e., an image based
on video recorded while the mobile device 105 was facing approximately
away from viewing location 110D), and may determine to select a second
primary component image of the 110E panorama image that corresponds to
the perspective of a viewer arriving from viewing location 110F (i.e., an
image based on video recorded while the mobile device 105 was facing
approximately toward the wall of the room in the direction of viewing location

110I).
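The primary-component selection can be illustrated with a simple scoring function such as the Python sketch below; the attribute names and weights are hypothetical stand-ins for the criteria described above (a view of other viewing locations, image quality, etc.).

    def choose_primary_component(components):
        """Pick the primary component image for a generated panorama.

        components: list of dicts with hypothetical per-image attributes, e.g.
        {"frame_index": 7, "visible_viewing_locations": 2, "quality": 0.8}.
        """
        def score(c):
            # Weight views of other viewing locations more heavily than quality.
            return 2.0 * c.get("visible_viewing_locations", 0) + c.get("quality", 0.0)

        return max(components, key=score)

    candidates = [
        {"frame_index": 12, "visible_viewing_locations": 0, "quality": 0.9},
        {"frame_index": 80, "visible_viewing_locations": 3, "quality": 0.6},
    ]
    print(choose_primary_component(candidates)["frame_index"])   # -> 80

Note that the selected frame index need not bear any relation to the order in which the frames were originally recorded, consistent with the discussion above.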
[0030] In the depicted embodiment of Figure 1A, the generation of the
panorama images and determination of relative positional information is
performed locally by the one or more processors 130 via BICA application
155 executing in memory 150 of the mobile device 105. In various
embodiments, some or all of such processing may be handled by one or
more remote server computing systems executing an embodiment of a
Building Interior Capture and Analysis system, as discussed in greater detail
with respect to Figure 1B below.
[0031] Figure 1B is a schematic diagram of an additional exemplary
building interior environment being captured via embodiments of a BICA
system. In particular, in the depicted embodiment of Figure 1B, a mobile
device 185 (optionally executing an embodiment of a BICA client
application) is utilized to capture each of multiple viewing locations 210A-
210H within the interior of building 198, as well as to capture associated
linking information during movement of the mobile device 185 between such
viewing locations. In addition, the depicted embodiment of Figure 1B further
includes a remote BICA server system 260 and associated storage 280,
details and operations of which are further described below.
[0032] In a manner similar to that described with respect to building
199 of
Figure 1A, during capture of the interior of building 198, the BICA
application
(e.g., a local client application executing on mobile device 185, the remote
BICA system 260 via communication over the network(s) 170, etc.) provides
guidance cues and manages video recording while a user of the mobile
device rotates the mobile device around a vertical axis at each of a
sequence of viewing locations 210A-210H. Furthermore, and also in a
manner similar to that described with respect to building 199 of Figure 1A,
the BICA application captures linking information during movement of the
mobile device between the viewing locations 210 along travel path 215. In
the depicted embodiment of Figure 1B, the captured linking information
includes sensor data provided by sensor modules of the mobile device
(including acceleration data) and further includes additional video recording
captured during such movement.
[0033] Following the capture of a last viewing location 210H in the
sequence, the BICA application receives an indication from the user that
capture of the building 198 interior is complete. In the depicted embodiment
of Figure 1B, the captured information regarding building 198 is transferred
for processing to the remote BICA system 260 via one or more computer
networks 170 (e.g., as initiated by a local BICA client application, if any,
on
the mobile device; as initiated by a user of the mobile device; as initiated
by
the remote BICA system 260, such as periodically or as is otherwise
initiated; etc.). In various
embodiments, such contact and ensuing
information transmission may be performed at various times. For example,
the BICA application may allow the user to schedule the transmission for a
specified time, such as to conserve battery power for the mobile device 185
by restricting transmissions to time periods in which the mobile device is
externally powered, or to delay transmitting until a particular network
connection is available (e.g., in order to utilize a local Wi-Fi connection
for
such transmission rather than a cellular network connection, such as to
lessen or remove the impact of the transmission on a limited "data plan" of
the user), or to delay transmitting until the mobile device is docked or
otherwise can use a non-wireless physical connection.
[0034] In certain scenarios and embodiments, portions of the captured
information for a building interior may be transmitted at different times for
subsequent processing. For example, video recordings captured at some or
all of the viewing locations for a building interior may be transmitted
independently of any linking information captured during movement of the
mobile device between such viewing locations, or vice versa. As another
example, one or more portions of captured information for a building interior
may be transmitted prior to fully completing the capture of all viewing
locations within that building interior, such as to enable the remote BICA
system 260 to generate corresponding panorama images for such viewing
locations concurrently with the capture of additional building interior
information, to determine relative positional information for certain viewing
locations concurrently with the capture of additional building interior
information, and/or to analyze the transmitted portions of the captured
information to determine and provide notification of any problems with those
transmitted portions. In this manner, the BICA system may provide a
notification to the user that one or more of the viewing locations should be
recaptured while the user is still within the building interior, such as if
the
BICA system determines during processing of the corresponding video
recordings for those viewing locations that such video recordings are of
insufficient or undesirable quality to serve as the basis for generating a
panorama image, or do not appear to provide complete coverage of the
building (e.g., if only 1 of 3 expected bathrooms have been captured, such
as based on a floor plan or other information that is available about the
building).
[0035] In the depicted implementation of Figure 1B, the BICA system 260
includes a Building Interior Data Acquisition manager 262 (for managing the
acquisition, receipt and storage of captured media, sensor data, and other
information related to building interiors); a Panorama Generation manager
266 (for managing analysis of received media, including generation of
panorama images for viewing locations based on received video data); a
Panorama Connection manager 264 (for managing analysis of received
sensor data and other information, including to determine relative positional
information regarding related viewing locations of a building interior); and
Building Interior Representation Presentation manager 268 (for presenting
linked panoramas or other generated representations of building interiors,
such as via a GUI provided by the BICA system, or for otherwise providing
such information to other systems for display, such as via an API, or
application programming interface, not shown, provided by the BICA system
for programmatic access by remote executing software programs). The
BICA system is communicatively coupled (locally or remotely) to storage
facility 280, which includes database 286 with acquired building interior data

(e.g., videos or other visual information for viewing locations; linking
information between viewing locations, such as video data and/or other
sensor data; etc.), database 282 with generated linked panorama building
information, and user information database 284 with various user-specific
information (e.g., user preferences). In certain implementations, the storage
facility 280 may be incorporated within or otherwise directly operated by the
BICA system; in other implementations, some or all of the functionality
provided by the storage facility may be provided by one or more third-party
network-accessible storage service providers (e.g., via network 170 and/or
one or more other intervening networks, not shown).
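The following Python sketch mirrors the component structure just described as a skeleton class; the method names, data layout and placeholder bodies are illustrative assumptions rather than an actual implementation of the BICA system.

    class BICAServer:
        """Structural skeleton of the managers and storage named above."""

        def __init__(self):
            # Rough analogue of storage facility 280 (databases 286, 282 and 284).
            self.acquired_building_data = {}
            self.linked_panorama_building_info = {}
            self.user_info = {}

        # Building Interior Data Acquisition manager 262
        def receive_captured_data(self, building_id, captures, linking_segments):
            self.acquired_building_data[building_id] = {
                "captures": captures, "linking": linking_segments}

        # Panorama Generation manager 266
        def generate_panoramas(self, building_id):
            captures = self.acquired_building_data[building_id]["captures"]
            return {c["location_id"]: {"frame_count": len(c["frames"])} for c in captures}

        # Panorama Connection manager 264
        def connect_panoramas(self, building_id):
            linking = self.acquired_building_data[building_id]["linking"]
            return [(seg["from"], seg["to"]) for seg in linking]

        # Building Interior Representation Presentation manager 268
        def build_presentation(self, building_id):
            presentation = {"panoramas": self.generate_panoramas(building_id),
                            "links": self.connect_panoramas(building_id)}
            self.linked_panorama_building_info[building_id] = presentation
            return presentation

    server = BICAServer()
    server.receive_captured_data("building-198",
                                 captures=[{"location_id": "210A", "frames": []}],
                                 linking_segments=[{"from": "210A", "to": "210B"}])
    print(server.build_presentation("building-198"))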
[0036] Continuing the example of Figure 1B, processing of the captured
information regarding the building 198 interior is performed in a manner
similar to that described with respect to the processing of captured
information regarding the building 199 interior of Figure 1A. In particular, a

panorama image is generated for each of viewing locations 210A-210H
based on one or more analyses of the respective video recording
corresponding to each such viewing location, with various operations
performed on individual frames of each such video recording as part of
generating a corresponding panorama image. Analysis of the linking
information corresponding to each segment of travel path 215 is performed
in order to determine relative positional information between successive
viewing locations 210 along that travel path, including an analysis of

acceleration data (and any additional sensor data) corresponding to each
such travel path segment to determine a relative location of viewing location
210B with respect to previous viewing location 210A; a relative location of
viewing location 210C with respect to previous viewing location 210B; and
so on, as further discussed with respect to Figures 2A-2B.
[0037] In various scenarios and embodiments, specific aspects of the
processing of the captured information may be performed by the remote
BICA system 260, by a local BICA client application (not shown) executing
on mobile device 185, or both. For
example, the local BICA client
application may analyze captured sensor data in order to insert one or more
markers into corresponding video information recorded during capture of the
building interior, such as to separate the recorded video information into
portions respectively corresponding to the capture of each viewing location
within the building interior and other portions respectively corresponding to
the capture of linking information during movement between those viewing
locations. In this manner, transmission and/or analysis of the captured
information may be performed in an apportioned manner rather than as a
single unit. As another example, the remote BICA system 260 may
generate a panorama image for each of the viewing locations within a
building interior, while a local BICA client application executing on mobile
device 185 may analyze the captured linking information in order to
determine relative locations for such viewing locations, or vice versa. It
will
be appreciated that in various embodiments, any combination of local and
remote processing of the captured information regarding a building interior
may be performed by one or both of the remote BICA system and local
BICA client application, or that instead only one of the remote and local
applications may be used.
[0038] In the depicted computing environment 180 of Figure 1B, the
network 170 is one or more publicly accessible linked networks, possibly
operated by various distinct parties, such as the Internet. In other
implementations, the network 170 may have other forms. For example, the
network 170 may instead be a private network, such as a corporate or
university network that is wholly or partially inaccessible to non-privileged
users. In still other implementations, the network 170 may include both
private and public networks, with one or more of the private networks
having access to and/or from one or more of the public networks.
Furthermore, the network 170 may include various types of wired and/or
wireless networks in various situations. In
addition, in this illustrated
example of Figure 1B, users may utilize client computing systems and/or
other client devices (such as mobile device 185) to interact with the BICA
system 260 to obtain various described functionality via the network 170,
and in doing so may provide various types of information to the BICA
system. Moreover, in certain implementations, the various users and
providers of the networked environment 180 may interact with the BICA
system and/or one or more other users and providers using an optional
private or dedicated connection.
[0039] Again with reference to Figure 1B, once respective panorama
images have been generated for each of the viewing locations 210A-210H
and relative positional information with respect to those viewing locations
has been determined (e.g., one or more of location, distance, rotation
relative to a viewing location's panorama image's starting point or other
direction information, etc.), the BICA system 260 generates a presentation
of the building 198 interior based on the respective panorama images and
relative positional information to create a group of linked panorama images.
In particular, based on the relative positional information, the BICA system
associates each panorama image (which corresponds to a single one of
viewing locations 210) with additional information reflecting one or more
links to one or more other of the viewing locations. For example, in the
depicted embodiment, the BICA system might associate the generated
panorama image corresponding to viewing location 210G with links
respectively associated with each of viewing locations 210C, 210E, 210F,
and 210H. In certain embodiments, the BICA system may determine to
associate a panorama image with links corresponding to each additional
viewing location within the building interior that satisfy one or more defined

criteria. As non-limiting examples, such criteria for associating a link may
include whether the viewing location corresponding to the link is visible at
least in part from the viewing location that corresponds to the panorama
image; whether the viewing location corresponding to the link is within a
defined proximity to the viewing location that corresponds to the panorama
image; whether sufficient information is available to determine the relative
position or direction of the viewing location from the viewing location that
corresponds to the panorama image; or other suitable criteria. Note that, as
described above with respect to the generated panorama image
corresponding to viewing location 210G, links may be associated with a
panorama image such that the associated links include links corresponding
to viewing locations other than those that were consecutively captured
during the original capture process - for example, during the capture of
building 198 interior, viewing location 210G was immediately preceded
along travel path 215 by viewing location 210E and immediately followed by
viewing location 210H, and yet links may be associated for the 210G
panorama image that correspond to any or all of viewing locations 210A,
210B, 210C, 210D, and 210F as well.
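One non-limiting way to express such link-selection criteria in code is shown in the Python sketch below, which links another viewing location if it is marked visible or lies within a distance threshold, and records a bearing for the directional link; the threshold, the visibility set and the flat 2-D coordinates are illustrative assumptions.

    import math

    def links_for_panorama(location_id, positions, visibility, max_distance_m=8.0):
        """Choose which other viewing locations to link from one panorama.

        positions: dict of location_id -> (x, y) coordinates in a common frame.
        visibility: set of (a, b) pairs of locations with a clear line of sight.
        """
        origin = positions[location_id]
        links = []
        for other, point in positions.items():
            if other == location_id:
                continue
            distance = math.dist(origin, point)
            visible = ((location_id, other) in visibility
                       or (other, location_id) in visibility)
            if visible or distance <= max_distance_m:
                bearing = math.degrees(math.atan2(point[1] - origin[1],
                                                  point[0] - origin[0]))
                links.append({"to": other, "distance_m": round(distance, 2),
                              "bearing_deg": round(bearing, 1)})
        return links

    positions = {"210G": (0, 0), "210H": (3, 1), "210E": (-2, 4), "210A": (15, 12)}
    print(links_for_panorama("210G", positions, visibility={("210G", "210H")}))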
[0040] In certain embodiments, generating a presentation of the building
198 interior may include determining an initial panorama image to display as
a "starting point" of the presentation. It will be appreciated that the
initial
panorama image selected by the BICA system may or may not correspond
to the first viewing location for the original capture of the building
interior
(i.e., viewing location 210A for the building 198 interior in Figure 1B). For
example, the BICA system may determine to designate an initial panorama
image that corresponds to a viewing location visible from the most other
viewing locations within the building interior; that corresponds to a viewing
location within a particular type of room within the building interior (e.g.,
a
building lobby, an entryway, a living room, a kitchen, or other type); that
corresponds to a viewing location within a room of the building interior that
appears to encompass a greatest square footage; or that corresponds to a
viewing location satisfying other defined criteria.
[0041] In addition to the automated generation of the representation of the

building 198 interior (including generation of panorama images and
determination of inter-panorama links based on determined relative position
information between corresponding viewing locations), the described
techniques may in at least some embodiments include enabling the user
carrying the mobile device and/or one or more other users (e.g., operators
of the BICA system 260) to modify the generated representation of the
building interior in various manners, such as via a corresponding GUI
provided by the BICA system. Such modification may include, for example,
adding, deleting and/or changing determined inter-panorama links (e.g., to
adjust links to pass through doorways or other passages between rooms
rather than through walls; to add or remove links corresponding to end user
expectations of related viewing locations; etc.). In
addition, such
modification may further include other changes, such as changing
panorama images (e.g., removing a panorama image if its viewing location
is not useful or if there are other problems with the panorama image;
selecting a new starting image/direction in a panorama image when it is
initially displayed; selecting a new starting panorama image to display for
the building interior; etc.), adding or otherwise modifying textual and/or
audio annotations or descriptions for particular panoramas images and/or
inter-panorama links, etc.
[0042] Once a presentation for a building interior is generated, the
BICA
system stores the presentation for future use (e.g., in linked panorama
building information database 282 or other component of storage 280 as
depicted within Figure 1B), such as to respond to one or more later requests
to display the presentation for a specified building interior that has been
previously captured. For example, in response to a request from an end
user to display a presentation for building 198 interior, the BICA system 260
may retrieve the relevant presentation information from storage 280 and
selectively transmit some or all of such presentation information to a client
computing system (not shown) of the end user (such as via network 170).
In so doing, the BICA system provides information that causes the client
computing device to display an initial panorama image corresponding to a
determined first viewing location within the building interior - for example,
viewing location 210C may be selected as the first viewing location for the
presentation, with an initial primary component image of the panorama
image associated with viewing location 210C being displayed to the end
user.
[0043] In at least some embodiments, the display of the panorama image
is
performed in a user-navigable manner, such as to allow the user to
determine a sequence, direction, and/or rate of display of additional
component images of the generated panorama image. For example, in
certain embodiments the user may navigate the panorama image by using
an input device (a mouse, touchscreen, virtual-reality display, or other input

device) to selectively "turn" within the panorama image, such that the BICA
system causes the client computing system to display one or more additional
component images or other portions of the panorama image in accordance
with the received user input. In addition, the presentation as a whole is
navigable by the user via selection of the respective link information
associated with one or more other viewing locations (and other
corresponding panorama images) by the BICA system when generating the
presentation information - in this manner, the user may navigate the entirety
of the presentation for a building interior via selection of displayed links
during
display of panorama images, such as to initiate display by the BICA system
of other corresponding panorama images associated with other viewing
locations within the building interior to which the selected links correspond.
[0044] Figures 2A-2B illustrate examples of analyzing and using information

acquired from an interior of a building in order to generate and provide a
representation of that interior, including to determine relative positional
information between the viewing locations for use in inter-connecting
panorama images or other visual information corresponding to those
viewing locations.
[0045] In particular, Figure 2A illustrates building 198 in a manner
similar to
that illustrated in Figure 1B, but with additional information shown in room
229 of the building that may be of use in determining connections between
panorama images for different viewing locations. In the example of Figure
2A, the room 229 includes various structural details that may be visible in
images (e.g., video frames) captured from viewing locations (e.g., viewing
locations within the room 229, such as 210A, 210B, and/or 210C), such as
multiple doorways 190 (e.g., with swinging doors), multiple windows 196,
multiple corners or edges 195 (including corner 195-1 in the northwest
corner of the building 198, as shown with respect to directional indicator
209), etc. In addition to the structural information in the room 229, the
illustrated example further includes additional furniture and other contents
in

the room 229 that may similarly be used in matching at least some images
from different viewing locations, such as a couch 191, chairs 192, a table
193, etc. Furthermore, the building 198 in this example also includes an
object 194 on the eastern wall of the building 198, such as may be visible
from viewing location 210C (e.g., corresponding to a painting, picture,
television, etc.). It will be appreciated that other structural and/or non-
structural features may be present and used in image matching in other
buildings in other embodiments.
[0046] In addition to building 198, Figure 2A further includes information
201 and 202 to demonstrate examples of using information about
overlapping features in frames from two panorama images at two viewing
locations in order to determine inter-connection information for the
panorama images. In particular, information 201 further illustrates room 229
and how features in the room may be used for image matching for viewing
locations 210A and 210C, such as based on structural and/or contents (e.g.,
furniture) features of the room. As non-exclusive illustrative examples,
information 201 illustrates viewing directions 227 from viewing location 210A
that each has an associated frame in the panorama image for that viewing
location, with the illustrated viewing directions 227 corresponding to various

features in the room 229. Similarly, information 201 also illustrates viewing
directions 228 from viewing location 210C that each has an associated
frame in the panorama image for that viewing location, with the illustrated
viewing directions 228 corresponding to the same features in the room 229
as the viewing directions 227. Using feature 195-1 in the northwest corner
of the room 229 as an example, a corresponding viewing direction 227A and
associated frame in the direction of that feature from viewing location 210A
is shown, and a corresponding viewing direction 228A and associated frame
from viewing location 210C to that feature is also shown - given such
matching frames/images to the same feature in the room from the two
viewing locations, information in those two frames/images may be compared
in order to determine a relative rotation and translation between viewing
locations 210A and 210C (assuming that sufficient overlap in the two
images is available). It will be appreciated that multiple frames from both
viewing locations may include at least some of the same feature (e.g.,
corner 195-1), and that a given such frame/image may include additional
information in addition to that feature (e.g., portions of the west and north walls, the
ceiling and/or floor, possible contents of the room, etc.) - for the purpose of
this example, the pair of frames/images being compared from the two
viewing locations corresponding to feature 195-1 may include the
image/frame from each viewing location with the largest amount of overlap,
although in actuality each image/frame from viewing location 210A in the
approximate direction of 227A that includes any of corner 195-1 may be
compared to each image/frame from viewing location 210C in the
approximate direction of 228A that includes any of corner 195-1 (and
similarly for any other discernible features in the room).
[0047] Information 202 of Figure 2A provides further illustration of how
the
frames/images in directions 227A and 228A may be used, along with the
other matching frames/images between the two viewing locations, in order
to determine inter-panorama directions and links to connect the panorama
images for the two viewing locations. In particular, information 202 includes
representations of viewing locations 210A and 210C, illustrating the
directions 227A and 228A from those viewing locations to the structural
feature 195-1 of the room 229. The viewing location 210A representation
further illustrates that the video capture of information for the panorama
image from viewing location 210A begins in direction 220A and, as shown in
the information 222A, proceeds in a clockwise manner corresponding to a
360° full rotational turn around a vertical axis, resulting in 150
frames/images being acquired from the viewing location 210A (e.g., 6
frames per second, if the full 360° rotation takes 25 seconds, although other
amounts of rotation time and/or frames per second may be used in other
situations, such as faster or slower rotation times and/or more or less
frames per second). After determining the image/frame of the panorama
image for viewing location 210A that includes the feature 195-1 for the
purpose of the image/frame matching in this example, information 224A
further illustrates that the image/frame is frame 133 of the 150 frames, and it
is 320° from the beginning direction 220A. In a similar manner, the visual
information captured from viewing location 210C begins in direction 220C,
and, as shown in information 222C, proceeds in an almost full rotation
around a vertical axis at that viewing location, corresponding to 355° of
rotation and 148 frames/images captured. After
determining the
image/frame of the panorama image for viewing location 210C that includes
the feature 195-1 for the purpose of the image/frame matching in this
example, information 224C further illustrates that the image/frame is frame
108 of the 148 frames, and it is 260° from the beginning direction 220C.
[0048] Based on the analysis of the matching pair of frames/images, the
relative rotation between the directions 227A and 228A may be used to
determine that the viewing locations 210A and 210C are located in direction
226 from each other (shown in this example as a single 2-way direction,
such as to include a direction 226a, not shown, from viewing location 210A
to viewing location 210C, and an opposite direction 226b, also not shown,
from viewing location 210C to viewing location 210A), as well as a distance
(not shown) for the translation between the viewing locations. Using the
determined direction 226, a corresponding inter-panorama link 225A-C is
created (in direction 226a) for the panorama image from viewing location
210A to represent viewing location 210C and its panorama image, with
information 223A indicating that the resulting rotation from starting
direction
220A is 84° and is centered at frame 35 of the 150 frames (with 15 frames
in each direction also including viewing location 210C, resulting in frames
20-50 of viewing location 210A's panorama image including a displayed
inter-panorama link in direction 226a to the associated panorama image for
viewing location 210C). Similarly, using the determined direction 226, a
corresponding inter-panorama link 225C-A is created (in direction 226b) for
the panorama image from viewing location 210C to represent viewing
location 210A and its panorama image, with information 223C indicating that
the resulting rotation from starting direction 220C is 190° and is centered at

frames 77 and 78 of the 148 frames (with 15 frames in each direction also
including viewing location 210A, resulting in frames 63-93 of viewing
location 210C's panorama image including a displayed inter-panorama link
in direction 226b to the associated panorama image for viewing location
210A).
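As a non-authoritative illustration of the frame/angle bookkeeping just described, the following Python sketch maps a rotation from a panorama's starting capture direction to the nearest frame index and to a window of frames that would display an inter-panorama link; it assumes evenly spaced frames over the capture, and the function names and window size are illustrative only.

```python
def direction_to_frame_index(rotation_degrees, num_frames, degrees_of_coverage=360.0):
    """Map a rotation from the panorama's starting capture direction to the
    nearest frame index, assuming frames are spaced evenly over the capture."""
    frames_per_degree = num_frames / degrees_of_coverage
    return int(round(rotation_degrees * frames_per_degree)) % num_frames

def link_frame_window(center_frame, num_frames, half_width=15):
    """Frames on either side of the center frame that would also display the
    inter-panorama link (wrapping around the panorama if needed)."""
    return [(center_frame + offset) % num_frames
            for offset in range(-half_width, half_width + 1)]

# Example roughly matching the text: a link 84 degrees from the starting
# direction of a 150-frame panorama is centered near frame 35, with frames
# 20-50 also displaying the link.
center = direction_to_frame_index(84.0, 150)
window = link_frame_window(center, 150)
print(center, window[0], window[-1])   # 35 20 50
```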
[0049] While the direction 226 is discussed in information 202 with
respect
to a single pair of frames 133 and 108 (from viewing locations 210A and
210C, respectively), it will be appreciated that viewing locations such as
210A and 210C in a single room (or otherwise with direct viewing
information to each other) will typically have numerous pairs of matching
frames/images that each include overlapping information, and may each be
used to similarly determine respective values for the relative positional
rotation and translation between the two viewing locations (such as based
on the directions 227 and 228 in information 201 as a small example subset
of matching frames/images). As discussed in greater detail below, the
information from multiple such matching frames/image pairs may be
combined in order to determine an overall relative rotation and translation
between the two viewing locations, with the confidence in the resulting
overall values typically growing as the number of matching frames/images to
be analyzed increases. In other embodiments, image matching may be
performed using only structural features, only furniture or other objects
within a room, and/or one or both of those types of information in
combination with other additional types of feature information that are
discernible in images from different locations.
[0050] The example information 202 further illustrates additional inter-
panorama connections to other viewing locations from viewing location
210C that may be determined based on overlapping corresponding
matching frames/images from viewing location 210C and those other
viewing locations, with resulting generated inter-panorama links being
shown. In particular, the additional inter-panorama links include an inter-
panorama link 225C-B in a determined direction to viewing location 210B,
an inter-panorama link 225C-D in a determined direction to viewing location
210D (e.g., if sufficient image overlap is available for images from both
viewing locations along the north wall of the hallway moving east-west
through the building 198), and an inter-panorama link 225C-G in a direction
to viewing location 210G (e.g., if sufficient information overlaps in the
images from the two viewing locations along that hallway, along the eastern
wall of the building 198 where object 194 is present, and/or along the
western wall of the building to include images of chairs 192, table 193, and
nearby window 196). While it is possible that sufficient overlap may be
present from other viewing locations to that of viewing location 210C to
enable relative positional information to be determined from overlapping
image information, such as for one or more of viewing locations 210E, 210F,
and/or 210H, the general lack of overlap in visual information from the
respective viewing locations may prevent such a determination using that
information in the current example, and one or both of viewing locations
210D and 210G may similarly lack sufficient information to determine their
respective inter-panorama links (or to determine the directions for such
inter-panorama links with only low confidence values). Also, while viewing
location 210A includes only a single inter-panorama link 225A-C in
information 202 in this example, it will be appreciated that an additional
inter-panorama link between viewing locations 210A and 210B may be
determined in a manner similar to that discussed with respect to that of
viewing locations 210A and 210C.
[0051] Figure 2B continues the example of Figure 2A, and in particular
illustrates information 203 regarding similar types of inter-panorama rotation

and distance information that may be determined corresponding to the
viewing locations from which the panorama images are taken, but with the
determination in Figure 2B being based on analyzing linking information
corresponding to a travel path that a user takes between viewing locations
(whether in addition to or instead of using connection information determined
from image/frame mapping as discussed with respect to Figure 2A).
[0052] In particular, the information 203 of Figure 2B illustrates viewing
locations 210A, 210B, and 210C, and also shows travel path information
235a indicating a path of the user moving from viewing location 210A to
viewing location 210B, and travel path information 235b indicating a path of
the user subsequently moving from viewing location 210B to 210C. It will be
appreciated that the order of obtaining the linking information may vary,
such as if the user instead started at viewing location 210B and captured
linking information as he or she traveled along path 235b to viewing location
210C, and later proceeded from viewing location 210A to viewing location
210B along travel path 235a with corresponding linking information captured
(optionally after moving from viewing location 210C to 210A without
capturing linking information). The information 203 includes some of the
information 202 previously illustrated in Figure 2A, and includes some

additional information (e.g., regarding viewing location 210B), but some
details are omitted in Figure 2B relative to Figure 2A for the sake of clarity
-
for example, information 220A is shown to illustrate the starting direction
from which the video data is captured at viewing location 210A, but details
such as information 222A about the number of frames and degrees of
coverage captured for the resulting panorama image are not illustrated.
[0053] In addition, information 203 of Figure 2B illustrates additional
details
about the user travel paths 235a and 235b, such as to indicate that the user
departs from the viewing location 210A at a point 237 in a direction that is
just west of due north (as illustrated with respect to directional indicator
209), proceeding in a primarily northward manner for approximately a first
half of the travel path 235a, and then beginning to curve in a more easterly
direction until arriving at an incoming point 238 to viewing location 210B in
a
direction that is mostly eastward and a little northward. In order to
determine the departure direction from point 237 more specifically, including
relative to the video acquisition starting direction 220A for viewing location

210A, initial video information captured as the user travels along travel path

235a may be compared to the frames of the panorama image for viewing
location 210A in order to identify matching frames/images (in a manner
similar to that discussed with respect to Figure 2A and elsewhere for
comparing frames/images from different viewing locations) - in particular, by
matching one or more best frames in that panorama image that correspond
to the information in the initial one or more video frames/images taken as
the user departs from point 237, the departure direction from point 237 may
be matched to the viewing direction for acquiring those matching panorama
images. While not illustrated, the resulting determination may correspond to
a particular degree of rotation from the starting direction 220A to the one or

more matching frames/images of the panorama image for that departure
direction. In a similar manner, in order to determine the arrival direction at

point 238 more specifically, including relative to the video acquisition
starting
direction 220B for viewing location 210B, final video information captured as
the user travels along travel path 235a may be compared to the frames of
the panorama image for viewing location 210B in order to identify matching
frames/images, and in particular to frames/images in direction 239
(opposite to the side of viewing location 210B at which the user arrives).
[0054] While such departure direction and arrival direction would match the

actual relative direction 232 between the viewing locations 210A and 210B
(with direction 232 being a two-way direction in a manner similar to that of
direction 226 of Figure 2A, including the direction of inter-panorama link
225A-B from viewing location 210A to 210B and the direction of inter-
panorama link 225B-A from viewing location 210B to 210A) if the travel path
235a was completely straight, that is not the case here. Instead, in order to
determine the direction 232, acceleration data captured as part of the linking

information for the travel path 235a is analyzed to identify user velocity and

location along the travel path 235a, in order to model the resulting relative
locations of the travel path between starting point 237 and arrival point 238.

Information 206 and 207 illustrates examples of such analysis of
corresponding acceleration data captured along the travel path 235a, with
information 206 corresponding to acceleration and velocity in a north-south
direction, and information 207 corresponding to acceleration and velocity in
an east-west direction - while not illustrated here, in some embodiments
further information will be determined for acceleration and velocity in a
vertical direction, such as to manage situations in which a user ascends or
descends stairs or otherwise changes a vertical height (e.g., along a ramp)
as he or she moves along the travel path. In this example, referring to
information 206 corresponding to the north-south direction, the acceleration
data acquired (e.g., from one or more IMU units in a mobile device carried
by the user) illustrates that there is an initial significant acceleration
spike in
the northerly direction as the user began moving, which then drops to near
zero as the user maintains a constant velocity in a generally northern
direction along the middle portion of the travel path 235a, and then begins a
longer but less sharp acceleration in the southerly direction as the user
curves to a primarily easterly direction toward viewing location 210B and
decelerates at arrival. As discussed in greater detail elsewhere herein, the
acceleration data is integrated to determine corresponding north-south
velocity information, as further illustrated in information 206, and is then
further integrated to determine location information for each data point (not
shown in information 206 in this example, but corresponding to the
illustrated travel path 235a). By combining the determined velocity and
location information, an amount of north-south movement by the user along
travel path 235a may be determined, corresponding to an aggregate
amount of north-south distance traveled between viewing locations 210A
and 210B. In a similar manner, information 207 illustrates acceleration and
velocity information in an east-west direction for the travel path 235a as the

user moves along the travel path, with the resulting double integration in
velocity and location data providing an aggregate amount of east-west
distance that the user travels along the travel path 235a. By combining the
aggregate north-south and east-west distances (and assuming in this
example that no height change occurred) with the determined departure and
arrival information, a total distance traveled between viewing locations 210A
and 210B in a corresponding direction 232 is determined.
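A minimal sketch of the double integration just described, assuming regularly sampled, gravity-subtracted acceleration along each horizontal axis; the sampling rate and acceleration profile below are hypothetical, and real sensor data would additionally require the bias correction discussed later with respect to Figures 2F-2G.

```python
import numpy as np

def integrate_axis(acceleration, dt):
    """Integrate one axis of (gravity-subtracted) acceleration samples to
    velocity, then integrate velocity to displacement, using a simple
    cumulative trapezoidal rule."""
    acceleration = np.asarray(acceleration, dtype=float)
    velocity = np.concatenate(
        ([0.0], np.cumsum((acceleration[1:] + acceleration[:-1]) / 2.0 * dt)))
    displacement = np.concatenate(
        ([0.0], np.cumsum((velocity[1:] + velocity[:-1]) / 2.0 * dt)))
    return velocity, displacement

# Hypothetical 50 Hz samples for the north-south and east-west axes.
dt = 1.0 / 50.0
accel_ns = np.zeros(500); accel_ns[:25] = 0.4; accel_ns[-50:] = -0.2
accel_ew = np.zeros(500); accel_ew[250:300] = 0.1
_, dist_ns = integrate_axis(accel_ns, dt)
_, dist_ew = integrate_axis(accel_ew, dt)
print(np.hypot(dist_ns[-1], dist_ew[-1]))   # aggregate distance traveled
```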
[0055] While a similar user travel path 235b is illustrated from viewing
location 210B to 210C, with similar acceleration data captured as part of its
linking information, corresponding acceleration and velocity information is
not illustrated for the travel path 235b in a manner analogous to that of
information 206 and 207. However, based on a similar analysis of departing
direction from viewing location 210B, arrival direction at viewing location
210C, and intervening velocity and location for some or all data points for
which acceleration data is captured along the travel path 235b, the user's
movement for travel path 235b may be modeled, and resulting direction 231
and corresponding distance between viewing locations 210B and 210C may
be determined. As a result, inter-panorama link 225B-C may be generated
for the panorama image generated at viewing location 210B in a direction
231 to viewing location 210C, and similarly, inter-panorama link 225C-B
may be determined for the panorama generated at viewing location 210C in
direction 231 to viewing location 210B.
[0056] Despite the lack of linking information captured between viewing
locations 210A and 210C (e.g., because the user did not travel along a path
between those viewing locations, because linking information was not
captured as a user did travel along such a path, etc.), information 203
further illustrates an example of direction 226 that may optionally be
determined between viewing locations 210A and 210C based on the
analysis of linking information for travel paths 235a and 235b (and with
corresponding inter-panorama links 225A-C and 225C-A in direction 226).
In particular, even if the absolute locations of viewing locations 210A, 210B
and 210C are not known from the analysis of the linking information for
travel paths 235a and 235b, relative locations of those viewing locations
may be determined in a manner discussed above, including distances and
directions between viewing locations 210A and 210B and between viewing
locations 210B and 210C. In this manner, the third side of the resulting
triangle having determined lines 232 and 231 may be determined to be line
226 using geometrical analysis, despite the lack of direct linking information

between viewing locations 210A and 210C. It will be further noted that the
analysis performed with respect to travel paths 235a and 235b, as well as
the estimation of direction and distance corresponding to 226, may be
performed regardless of whether or not viewing locations 210A, 210B and/or
210C are visible to each other - in particular, even if the three viewing
locations are in different rooms and/or are obscured from each other by
walls (or by other structures or impediments), the analysis of the linking
information may be used to determine the relative locations discussed
above (including directions and distances) for the various viewing locations.
It will be appreciated that the techniques illustrated with respect to Figures

2A and 2B may continue to be performed for all viewing locations in
building 198, resulting in a set of linked panorama images corresponding to
viewing locations 210A-H, or otherwise in other similar buildings or other
structures.
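The geometrical analysis of the third side can be illustrated with a short sketch that simply sums the two modeled displacement vectors; compass-style bearings (0 degrees = north, increasing clockwise, matching directional indicator 209) are assumed, and the example distances and directions below are hypothetical rather than taken from the figures.

```python
import math

def third_leg(distance_ab, bearing_ab_deg, distance_bc, bearing_bc_deg):
    """Given the modeled legs A->B and B->C (each a distance and a compass
    bearing in degrees), return the distance and bearing of the implied
    leg A->C by summing the two displacement vectors."""
    ax = distance_ab * math.sin(math.radians(bearing_ab_deg))   # east component
    ay = distance_ab * math.cos(math.radians(bearing_ab_deg))   # north component
    bx = distance_bc * math.sin(math.radians(bearing_bc_deg))
    by = distance_bc * math.cos(math.radians(bearing_bc_deg))
    cx, cy = ax + bx, ay + by
    return math.hypot(cx, cy), math.degrees(math.atan2(cx, cy)) % 360.0

# Hypothetical legs: 4 m heading a little east of north, then 3 m heading east.
print(third_leg(4.0, 20.0, 3.0, 95.0))
```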
[0057] The following discussion, including with respect to corresponding
Figures 2C-2I, provides example details regarding particular embodiments
for determining inter-panorama connection information - however, it will be
appreciated that the details presented are for illustrative purposes, and
other embodiments may be performed in other manners.
[0058] As discussed in greater detail with respect to Figure 2A and
elsewhere, connections between at least some panorama images may be
determined in part or in whole based on matching frames/images
corresponding to those panorama images. Such techniques may be of
particular use if the scene is rich in visual texture/features, and the two
panoramas' viewing locations have direct line-of-sight to each other.
[0059] Consider, as an example, two panorama images 0 and 1, with
panorama image 0 including a sequence of frames I-00, I-01, I-02, I-03, ... I-0m
and having respective angles a-00, a-01, a-02, a-03, ... a-0m with
respect to that panorama image's starting video acquisition direction, and
with panorama image 1 including a sequence of frames I-10, I-11, I-12, I-13,
... I-1n and having respective angles a-10, a-11, a-12, a-13, ... a-1n with
respect to that panorama image's starting video acquisition direction. The
results of analyzing the matching frames/images between the panorama
images includes determining whether the two panorama images are visually
connected, and if so, what is the orientation angle A-01 in panorama image 0
toward panorama image 1, and what is the orientation angle A-10 in panorama
image 1 toward panorama image 0.
[0060] As one technique for calculating such orientation angles A-01
and A-
10, every frame from panorama image 0 is compared with every frame from
panorama image 1, to see if they are visually connected. So if there are m
frames in panorama image 0, and n frames in panorama image 1, m x n
comparisons will be performed. For each comparison of such an image
pair, a check is performed of whether the two images have sufficient visual
feature matches to determine relative position information. To do so, visual
feature locations of each of the two images are first detected, such as by
using one or more of existing SIFT, MSER, FAST, KAZE, etc. feature
detectors. Feature
descriptor vectors are then calculated around the
detected feature location neighborhood to describe the feature, such as by
using one or more of existing SIFT, BRIEF, ORB, AKAZE etc. feature
descriptors. A check is then made between the two images in the image
pair for whether a feature descriptor from one image has a similar feature
descriptor in the other image, and if so that feature pair forms a putative
feature pair - in so doing, a feature descriptor is considered similar to another
feature descriptor when the descriptors have a short distance in the vector space
(e.g., below a defined distance threshold, such as using L2 distance, L1
distance, Hamming distance for binary descriptors, etc.), and a frame pair

has enough putative feature matches if they satisfy or exceed a defined
feature match threshold.
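As a non-authoritative sketch of the feature detection and putative-matching step, the following uses OpenCV's ORB detector/descriptor (one of the options named above) with a ratio test standing in for the "short distance in the vector space" check; the ratio value and the minimum-match threshold are illustrative assumptions rather than values from the text.

```python
import cv2

def putative_matches(img0, img1, ratio=0.75, min_matches=15):
    """Detect ORB features in each frame, match binary descriptors with Hamming
    distance, and keep pairs passing a ratio test; return the putative feature
    pairs and whether they reach the (assumed) feature match threshold."""
    orb = cv2.ORB_create(nfeatures=2000)
    kp0, des0 = orb.detectAndCompute(img0, None)
    kp1, des1 = orb.detectAndCompute(img1, None)
    if des0 is None or des1 is None:
        return [], False
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING)
    pairs = []
    for candidates in matcher.knnMatch(des0, des1, k=2):
        if len(candidates) == 2 and candidates[0].distance < ratio * candidates[1].distance:
            m = candidates[0]
            pairs.append((kp0[m.queryIdx].pt, kp1[m.trainIdx].pt))
    return pairs, len(pairs) >= min_matches
```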
[0061] Comparison of an image pair may in some embodiments include
computing a homography matrix (referred to as "H" in this example) and/or
an essential matrix (referred to as "E" in this example). If two images I-0i
and I-1j of a putative matching image pair are looking at a planar surface (2D
surface in the 3D world, such as a wall with some pictures on it), and if 4 pairs
of putative feature matches exist given the matching locations on the images,
a Homography matrix H can be recovered such that for any pair of features
(p0x, p0y) in I-0i and (p1x, p1y) in I-1j, H can be applied to (p0x, p0y) to
directly compute the location of the corresponding feature (p1x, p1y) in I-1j. If
more than 4 pairs of putative feature matches exist that are all true matching

features, a least square solution of H can be computed - in addition, if some
of the putative matches are outliers, Random Sampling Consensus algorithm
("RANSAC") can be performed to achieve a robust estimation of H. If the two
images I-0i and I-1j of a putative matching image pair are looking at a scene

with 3D objects rather than a 2D surface (e.g., a room corner where two
walls and a floor meet), and if 5 pairs of putative feature matches exist
given
the matching locations on the images, an Essential matrix E can be
recovered such that for any pair of features (p0x, p0y) in I-0i and (p1x, p1y)
in I-1j, (p0x, p0y) from I-0i can be mapped with E to the neighborhood of
(p1x, p1y) in I-1j. The neighborhood is defined as closeness to the epipolar
lines of (p1x, p1y) in I-1j, with those epipolar lines defined as lines
connecting (p1x, p1y) and the epipole in I-1j, where the epipole is the
projection of I-0i's camera center (the optical center 3D location of the
camera which took the picture of I-0i) onto the image of I-1j. If more than 5

pairs of putative feature matches exist that are all true matching features, a least
square solution of E can be computed - in addition, if some of the putative
matches are outliers, RANSAC can be performed to achieve a robust
estimation of E. Once the H or E matrix is computed, the quantity of feature
pairs that are actually inliers can be counted, and if smaller than a defined
threshold (e.g., 15), the image pair is discarded from further evaluation as
being unlikely to be a valid pair looking at the same region of a scene.
Given Essential matrix E or Homography matrix H and the camera
parameters (intrinsics) with which the pictures I-0i and I-1j were taken, E or H can be
decomposed into a relative rotation 3-by-3 matrix R and relative translation
3-by-1 vector T between the two camera locations (there may be up to four
mathematical solution sets of the decomposition, at least two of which may
further be invalidated if point correspondences are available by applying a
positive depth constraint requiring all points to be in front of both cameras).
Additional
details for computing H from corresponding feature locations, for computing E
from corresponding feature locations, for performing least square solutions,
for performing RANSAC, and for decomposing matrix E into matrix R and
vector T are included in Multiple View Geometry in Computer Vision, 2nd
Edition,
Richard Hartley and Andrew Zisserman, Cambridge University Press, 2004.
Additional details for decomposing matrix H into matrix R and vector T are
included in Deeper Understanding Of The Homography Decomposition For
Vision-Based Control, Ezio Malis and Manuel Vargas, Research Report
RR-6303, INRIA, 2007, pp.90.
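A hedged sketch of computing both H and E under RANSAC and decomposing E into a rotation R and translation direction t, using OpenCV routines as stand-ins for the procedures cited above; the camera intrinsics matrix K, the RANSAC thresholds, and the 15-inlier cutoff echo the example value in the text but are otherwise assumptions.

```python
import cv2
import numpy as np

def relative_pose_from_pairs(pts0, pts1, K, min_inliers=15):
    """Estimate a homography H and an essential matrix E from putative feature
    pairs (RANSAC in both cases), and recover a relative rotation R and
    translation direction t from E; reject the pair if too few inliers remain."""
    pts0 = np.asarray(pts0, dtype=np.float64)
    pts1 = np.asarray(pts1, dtype=np.float64)
    H, _ = cv2.findHomography(pts0, pts1, cv2.RANSAC, 3.0)       # needs >= 4 pairs
    E, e_mask = cv2.findEssentialMat(pts0, pts1, K, method=cv2.RANSAC,
                                     prob=0.999, threshold=1.0)  # needs >= 5 pairs
    if E is None or e_mask is None or np.count_nonzero(e_mask) < min_inliers:
        return None          # unlikely to be a valid pair looking at the same region
    # recoverPose applies the positive-depth constraint to pick among the solutions.
    _, R, t, _ = cv2.recoverPose(E, pts0, pts1, K)
    return H, E, R, t
```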
[0062] Since information about whether an image pair is looking at a 2D
planar
surface (e.g. a wall) or a full 3D scene (e.g. a room corner) is not typically

available in advance of analysis of the image pair, both H and E matrices of
any given image pair are computed in some embodiments. The remaining
solution sets can be further evaluated in two aspects: (1) reprojection error,

in which given a pair of rotation and translation and feature correspondence
locations on the two images, the 3D feature locations can be computed using
a method called triangulation; and (2) rotational axis check, in which the
relative rotation between any two given image pairs should be around a
vertical rotational axis if users are holding the cameras vertically, and any
solution set that does not have a rotation whose rotational axis is close to a

vertical direction can be filtered out. The basic idea of reprojection error
is to
project the viewing rays of the feature back into the 3D space, with the 3D
location being where the two viewing rays of the same feature from the two
cameras meet or intersect, and with further details regarding performing
robust
triangulation available in Multiple View Geometry in Computer Vision, 2nd
Edition, as indicated above. The 3D points can then be reprojected onto the
images again, to check how close the reprojections are to the original
feature locations found in the feature detection step, and with the matrix R
and vector T solution set with the best performance selected as the most
likely true solution to the image pair. Solution
sets passing the
aforementioned two evaluation criteria are considered valid solutions, and
an image pair with at least one valid solution set is considered a valid image

pair for further angle computation.
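A small sketch of the rotational-axis check described above; it assumes the camera's "up" direction is the y axis and uses a 10-degree tilt tolerance, both of which are assumptions rather than values from the text.

```python
import numpy as np

def passes_vertical_axis_check(R, max_tilt_degrees=10.0):
    """Check that a candidate relative rotation R is approximately a rotation
    about the vertical axis, as expected when the camera is held upright."""
    # The rotation axis is the eigenvector of R whose eigenvalue is (closest to) 1.
    eigvals, eigvecs = np.linalg.eig(R)
    axis = np.real(eigvecs[:, np.argmin(np.abs(eigvals - 1.0))])
    axis /= np.linalg.norm(axis)
    vertical = np.array([0.0, 1.0, 0.0])        # assumed up direction
    tilt = np.degrees(np.arccos(np.clip(abs(axis @ vertical), -1.0, 1.0)))
    return tilt <= max_tilt_degrees
```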
[0063] Figure 2C illustrates a further example of image/frame matching
for
two panorama images (also referred to as "panos") 0 and 1. For example,
information 271 shows that, given a valid image pair with matrix R and vector
T
and the resulting triangulated 3D points, the location of the two panorama
centers
and the scene location can be determined, using the mean location of the
triangulated 3D points. The top-down relative angle between the two frames,
denoted as α, is computed, as well as the two angles β and γ. The per-frame
top-down angle of I-0i is also known from the panorama starting (reference)
orientation, denoted as φ, and with I-1j's angle denoted as θ. Thus, the
orientation from panorama image 0 transitioning to panorama image 1 relative to
panorama image 0's reference orientation, denoted as σ = φ + β, can be
computed, as well as the orientation from panorama image 1 transitioning to
panorama image 0 relative to panorama image 1's reference orientation, denoted
as ω = θ - γ. Such a σ and ω can be computed from every valid image pair, with

all valid image pairs checked and used to create a histogram (approximated
distribution) of gathered σs and ωs, as shown in information 272. Finally, the
consensus σ angle and ω angle from the distribution are chosen as the
transitioning direction from panorama image 0 to panorama image 1, and from
panorama image 1 to panorama image 0 respectively. In example information
272, the horizontal axis is the candidate orientation from pano 0 to pano 1,
ranging from 0 to 360 degrees, and the vertical axis shows the frame-pair counts
that generate the corresponding orientation angle - in this case, there are
two major angles that potentially are the orientation from pano 0 to pano 1
(i.e. around 0 degrees, and 75 degrees). This means the two hypotheses
are competing against each other, such as because the scene has some
repetitive features at different locations, and/or both orientations are valid

numerically in terms of homography/essential matrix decomposition. The
peak around 75 degrees is selected as the final orientation for this example as the
more likely solution.
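The consensus selection can be illustrated with a short sketch that histograms the per-pair candidate angles and picks the center of the most-populated bin; the bin width and the synthetic sample distribution (clustered near 75 degrees with a smaller competing cluster near 0 degrees, echoing information 272) are assumptions.

```python
import numpy as np

def consensus_angle(angles_degrees, bin_width=5.0):
    """Histogram the per-pair candidate orientations (the sigma or omega values)
    and return the center of the most-populated bin as the consensus direction."""
    angles = np.asarray(angles_degrees, dtype=float) % 360.0
    counts, edges = np.histogram(angles, bins=np.arange(0.0, 360.0 + bin_width, bin_width))
    peak = int(np.argmax(counts))
    return (edges[peak] + edges[peak + 1]) / 2.0, counts

# Hypothetical per-pair estimates: a dominant cluster near 75 degrees and a
# smaller competing cluster near 0 degrees.
rng = np.random.default_rng(0)
samples = np.concatenate([rng.normal(75, 3, 40), rng.normal(0, 3, 15)]) % 360
print(consensus_angle(samples)[0])
```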
[0064] As one example of an energy optimization process for a global
estimation of inter-panorama connection information (as discussed in
greater detail below), information 277 of Figure 2H illustrates example
variables and costs to be used for the optimization. In general, the example
costs attempt to minimize changes to the individually calculated information
values, while enforcing overall consistency (e.g., a first inter-panorama
connection angle between two panorama images that is calculated from
image/feature matching should be the same as a second inter-panorama
connection angle between those same two panorama images that is
calculated from using linking information, and vice versa; a calculated
location of a destination viewing location for a panorama image from
captured linking information should be the same as the actual location,
using a loose norm to account for linking information possibly not starting
and/or ending exactly at the respective viewing locations; calculated travel
path positions and turn angles from linking information should be the same
as actual, to minimize sharp turns and abrupt location changes; etc.).
Information 278 of Figure 2I shows one example in the right pane of the
effects of such an energy optimization process, relative to a contrasting
greedy algorithm illustrated in the left pane.
[0065] In addition, confidence values can further be determined for such
calculated inter-panorama connection angles from image/frame matching.
As one example, various factors may affect a visual connection between two
panorama images, and the resulting confidence value(s), such as the
following: (1) number of frames in each panorama image sequence,
reflecting an indirect indicator of speed of rotation, image blurriness, and
IMU signal smoothing; (2) angle between frame-pair viewing directions, with
both the intersection depth certainty and the potential perspective distortion

between corresponding features seen from the two views increasing with
the size of the angle; (3) per-frame matching inlier feature numbers,
modeling the texture richness of the viewing angles; (4) peak choice
confusion, corresponding to the number of peaks in sample orientation
consensus distribution (as the number of peaks increases, the likelihood
increases of choosing a wrong peak as a solution); and (5) sample circular
coverage, corresponding to the coverage of samples around the 360 degree
of a panorama circle that supports the final angle. In one
example
confidence value calculation technique, factors 4 and 5 are the primary
factors considered. For factor 4, the probability of the chosen peak being
correct is computed by marginalizing over likelihood of all peaks, with the
prior of all peaks assumed to be the same - the rationale is that when
multiple peaks are available in the distribution, the choice of the highest
peak is more probable to be a wrong choice than when there is only a single
peak. For factor 5, the probability of consensus angles being correct
increases as the coverage of samples around the 360 degree of a
panorama circle increases - the rationale is that if the chosen connection
angle gets support from multiple panorama angles from the scene rather
than a single direction (favoring a well-textured scene over a poorly-textured

one, as well as a same-room panorama connection over a connection
between two rooms that are only connected through a narrow door, and thus
corresponding in part to factor 3 as well), it is more likely to be the true
direction of connection.
[0066] With respect to calculating an example confidence value, a Peak
Choice Confusion determination is first performed. Using the information
272 of Figure 2C as an example, all the dominant modes (in this case two
modes) are first found, and all samples' probability of being part of each
mode are marginalized, as follows:
P(mode) = Σ P(mode | sample) * P(sample).
In the above equation, P(mode | sample) is represented as a rotational
Gaussian distribution (rotational distance to the central value), because
angles are periodic - for example, 359 degrees' distance to 10 degrees is
11 degrees. A heuristic standard deviation d is assigned to the Gaussian
model (10 degrees). In other words:
P(mode | sample) = 1/N * exp(-rotational_angle_diff^2 / (2*d)), where N is a normalization term.

Once all P(mode) are computed, they are normalized by their sum, so that
they add up to 1. Given the above definition, the more dominant a mode
is, the greater the confidence that the angle of that mode corresponds to
the correct pano-to-pano orientation or direction.
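A minimal sketch of this Peak Choice Confusion computation, following the marginalization and rotational-Gaussian form given above (including the 2*d denominator as written); a uniform prior over samples is assumed, and the names are illustrative.

```python
import numpy as np

def wrapped_diff_degrees(a, b):
    """Smallest rotational difference between two angles, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)

def peak_choice_confidence(mode_angles, sample_angles, std_dev=10.0):
    """Marginalize each sample's probability over the dominant modes with a
    rotational Gaussian, normalize the mode probabilities to sum to 1, and
    return the probability of the most dominant (chosen) mode."""
    sample_prob = 1.0 / len(sample_angles)                 # uniform P(sample)
    mode_scores = []
    for mode in mode_angles:
        score = 0.0
        for s in sample_angles:
            diff = wrapped_diff_degrees(mode, s)
            score += np.exp(-diff ** 2 / (2.0 * std_dev)) * sample_prob
        mode_scores.append(score)
    mode_scores = np.asarray(mode_scores) / sum(mode_scores)   # normalize to sum to 1
    return float(mode_scores.max())
```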
[0067] In addition to determining Peak Choice Confusion information,
Sample Circular Coverage information is also determined and used.
Consider an example of a top-down view of a room and two panoramas in it,
as shown in the left and right plots of information 273 of Figure 2D. Each
dotted line pair from the two panoramas represents a sample pair of frames
found in that specific direction, due to richness of textures in a
corresponding location in the room (represented by red dots on the room
wall). It is easy to see that the plot on the right has sample pairs from more

directions than that of the left plot (red sectors around the panorama
centers), and more sampled angles across the whole 360 degree range
suggest that the determined aggregate pano-to-pano
orientation will be more reliable (from a distribution/consensus of orientations). The
Sample Circular Coverage analysis includes dividing a full 360 degree span
into 36 sectors, and then checking the coverage of those sectors for each
panorama image. Similar to the mode computation noted above, the impact
of all samples to each sector is computed as:
P(sector) = Σ P(sector | sample) * P(sample).
P(sector) is then thresholded (0.1) to generate a binary decision if the
sector is contributing to the final orientation. The number of sectors that
have a positive contribution is counted and divided by the total number of
sectors (here 36). By so doing, the orientation computed between two
panoramas that are only connected in a single location in the room (e.g., by
a painting on a textureless wall) is less robust than the orientation
computed between two panoramas that are inside a well-textured house
with a larger number of room locations used for matching.
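A corresponding sketch of the Sample Circular Coverage computation, dividing the circle into 36 sectors and thresholding each sector's accumulated contribution at 0.1 as described; representing P(sector | sample) with the same rotational Gaussian as the mode computation is an assumption.

```python
import numpy as np

def circular_coverage_confidence(sample_angles, num_sectors=36, threshold=0.1,
                                 std_dev=10.0):
    """Accumulate each sample's contribution to each sector, make a binary
    per-sector decision, and return the fraction of covered sectors."""
    sector_centers = (np.arange(num_sectors) + 0.5) * (360.0 / num_sectors)
    sample_prob = 1.0 / len(sample_angles)
    covered = 0
    for center in sector_centers:
        p_sector = 0.0
        for s in sample_angles:
            diff = abs(center - s) % 360.0
            diff = min(diff, 360.0 - diff)
            p_sector += np.exp(-diff ** 2 / (2.0 * std_dev)) * sample_prob
        covered += int(p_sector > threshold)
    return covered / num_sectors
```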
[0068] A final confidence value is then determined for calculated inter-
panorama connection angles from image/frame matching by using a
multiplication of the two above factors (corresponding to Peak Choice
Confusion and Sample Circular Coverage), which model different aspects
related to confidence and are generally independent of each other. A pair
of panorama images has two determined inter-connection angles: an
outgoing angle from each pano to the other. Therefore, there is a
confidence score for each angle. However, the two angle confidences are
the same, since angles are computed by panorama to panorama mutual
matching. Accordingly, the determined direction 226 in information 202 of
Figure 2A is shown as a "double-ended" arrow.
[0069]   Confidence values for calculated inter-
panorama connection angles can further be determined differently when
using captured linking information for such calculations, rather than for
calculated angles from image/frame matching as discussed above. As
noted above, and with further details below, the calculating of inter-
panorama connection angles from captured linking information depends on
multiple factors, including the following: (1) frame visual matching from the
first panorama to the starting frames of the linking video, and the ending
frames of the linking video to the second panorama, with the accuracy of
relative angles between the panoramas to the linking videos depending on
the visual matching quality; (2) travel path length, in which IMU noise
impacting the double integration increases with the length; and (3)
straightness of the travel path, in which IMU noise impacting the double
integration increases with the number of turns or other deviations from a
straight travel path.
[0070] With respect to calculating an example confidence value based on
the use of captured linking information, a Visual Matching Weight
determination is first performed, with information 274 of Figure 2E as an
example. All frames in a panorama sequence are matched to a few frames
at the beginning/end of a linking video, such as based on using one or both
of the following to compute such visual matching: (1) feature-based visual
matching, with Homography used as a geometric verification model; and (2)
a DeepFlow-based matching (e.g., if technique (1) does not produce a reliable
matching result for any pair of pano-frame and linking-video frame, such as
due to lack of texture in the linking video frames from pointing to a white
wall
or for other reasons). The two techniques' reliability (prior) is set to 1.0
and
0.6 respectively (and referred to below as Matching Method Weight),
because the feature-based method is typically more robust than the
DeepFlow-based one if the feature-based method succeeds. For
feature-based visual matching, the consistency of angle directions from
different frame pairs is further taken into account, weighted by the number
of matches in each pair, as shown in information 274 of Figure 2E, with four
samples of angles between a frame in a panorama and a frame in a linking
video shown, using vector magnitude to indicate the number of matches in a
direction, and with the red arrow as the final chosen direction. The weighted
direction and its corresponding number of matches (magnitude of the red
vector) is mapped to a value within [0, 1], and referred to as the Visual
Matching Weight below, with the mapping being linear and clamped
between [10, 500].
[0071] In addition to determining Visual Matching Weight information,
information on Travel Path Length and Number Of Turns for the travel path
is also determined and used. The Travel Path Length Weight is modeled as
1 - num_frames * E1, where E1 is a predefined error loss when
accumulating a new frame, currently set to 0.001, and with a minimum
weight clamped at 0.01. The Number of Turns Weight is modeled as 1 -
num_turns * E2, where E2 is a predefined error loss when accumulating a
new turn, currently set to 0.02, and with a minimum weight also clamped at
0.01. To compute the number of turns, the curvature of the IMU data in the two
horizontal directions (referred to as "x" and "z" for the purpose of this
example, with "y" treated as the floor normal in this example IMU coordinate
system) is computed for each linking video frame, skipping the initial and final
frames (e.g., the initial 12 frames and last 12 frames) because those frames are
normally stationary and any curvature should correspond to noise. The quantity of
peaks above a certain curvature threshold (e.g., 8) during the whole video
is then counted and used as the number of turns.
[0072] A final confidence value is then determined for calculated inter-

panorama connection angles from captured linking information by using a
multiplication of the above factors (resulting in Matching Method Weight *
Visual Matching Weight * Travel Path Length Weight * Number of Turns
Weight). For this type of calculated inter-panorama connection angles, the
confidences for the two angles of a determined connection between two
panorama images are typically different, because the two outgoing angles
are computed asymmetrically (independent of each other), with the first
angle computed from the first panorama to linking video matching, and the
second angle computed from linking video to the second panorama
matching.
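The combination of the linking-information weights can be sketched directly from the values given above (error losses of 0.001 per frame and 0.02 per turn, and minimum weights of 0.01); the example inputs below are hypothetical.

```python
def linking_confidence(matching_method_weight, visual_matching_weight,
                       num_frames, num_turns,
                       frame_loss=0.001, turn_loss=0.02, min_weight=0.01):
    """Multiply the four weights described above into a confidence value for an
    inter-panorama connection angle computed from captured linking information."""
    travel_path_length_weight = max(1.0 - num_frames * frame_loss, min_weight)
    number_of_turns_weight = max(1.0 - num_turns * turn_loss, min_weight)
    return (matching_method_weight * visual_matching_weight *
            travel_path_length_weight * number_of_turns_weight)

# Hypothetical example: feature-based matching (weight 1.0) with a strong visual
# match, a 600-frame linking video, and 3 detected turns.
print(linking_confidence(1.0, 0.8, num_frames=600, num_turns=3))
```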
[0073] To make the comparison meaningful between confidence values
determined for calculated inter-panorama connection angles from
feature/image matching and from using captured linking information (e.g., so
that a global connection computation can use both types of connection
information, as discussed further below), a scaling between the two types of
confidence scores is recovered and used. As one example, this can be
performed empirically by creating a database of pano-connection
calculations as compared to actual results, allowing a relative confidence
scale between the two types of confidence values to be determined (which
can be thought of as the posterior probabilities P(Angle | feature/image
matching) * P(feature/image matching) and P(Angle | captured linking
information) * P(captured linking information)).
[0074] In some embodiments, as part of determining whether
feature/image matching will be used to determine a possible inter-
connection between two panorama images, an initial optional visibility
estimation check is performed, to allow panorama pairs that do not share
any (or sufficient) visual content to be filtered before more detailed
feature/image matching is performed. This involves attempting to find, from
some corresponding points between the panoramas, if a geometric model
can fit them, using a random generation of putative models from a subset of
the corresponding points. Doing so involves a two-step procedure, with an
initial step of feature point registration, using a feature-based matching in
order to detect some putative corresponding points between the two
panoramas. A second step of robust model estimation is then performed,
where an attempt is made to fit a coherent geometric model for those points.
To do so, an attempt is made to robustly estimate a single axis (vertical
axis) rotation matrix, using a minimal solver that uses 1-point
correspondence (which makes sampling and solving fast), using ACRansac
(which minimizes the angular distance between the estimated model and
provided corresponding vectors). If this single vertical axis rotation matrix
fails, an attempt is made to estimate a homography matrix. If neither model
can be estimated, this pair of panorama images is filtered from use in the
feature/image matching.
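As a hedged sketch of the single-axis (vertical) rotation estimation used in this visibility check, the following runs a plain fixed-threshold RANSAC over 1-point yaw hypotheses on unit bearing vectors of putatively matched features (a simplification of the ACRansac procedure named above, which instead adapts its threshold); the pure-rotation model, the y-up axis convention, and the thresholds are assumptions.

```python
import numpy as np

def estimate_vertical_rotation(bearings0, bearings1, num_iters=200, thresh_deg=3.0):
    """RANSAC over 1-point hypotheses: each correspondence of unit bearing
    vectors proposes a yaw angle (rotation about the vertical y axis), and the
    hypothesis with the most angular-distance inliers wins."""
    def yaw(v):
        return np.arctan2(v[0], v[2])            # heading of a bearing in the x-z plane
    def wrapped(a):
        return abs((a + np.pi) % (2.0 * np.pi) - np.pi)
    headings0 = np.array([yaw(b) for b in bearings0])
    headings1 = np.array([yaw(b) for b in bearings1])
    rng = np.random.default_rng(0)
    best_angle, best_inliers = None, -1
    for _ in range(num_iters):
        i = rng.integers(len(headings0))
        angle = headings1[i] - headings0[i]      # 1-point hypothesis
        diffs = np.degrees(wrapped(headings1 - headings0 - angle))
        inliers = int((diffs < thresh_deg).sum())
        if inliers > best_inliers:
            best_angle, best_inliers = angle, inliers
    return best_angle, best_inliers
```

A pair of panorama images could then be kept for detailed feature/image matching when the inlier count is sufficient, with the homography fallback attempted otherwise, as described above.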
[0075] When using captured linking information for a travel path between
two viewing locations to determine an inter-connection between the
panorama images for those viewing locations, acceleration data for the
user's movement along the travel path may be acquired and used, as
previously noted. For example, smart phones are typically equipped with
various motion sensors, often referred to in the aggregate as an inertia
measurement unit (IMU) - such sensors may include an accelerometer
(used as a low-frequency pose estimator), a gyroscope (high-frequency
pose estimator), and/or a magnetometer (e.g., a compass), and are known
to work well for estimating the device's rotation.
[0076] However, using such IMU data to estimate the device's position has
been difficult or impossible, particularly without the use of specialized
sensors that are not typically part of smart phones. The task sounds
straightforward in theory - given an estimation of the device's rotation, the
direction of gravity at every frame is known, and an integration can be
performed to obtain velocity (after subtracting gravity, a constant
acceleration on the earth, from the acceleration measures), and a second
integration performed to get the position. However, double-integration
operations are highly sensitive to noise and bias in the raw sensor
measurements, and a simplistic version of this operation may provide highly
incorrect and useless position estimates. Imagine you are going up in an
elevator from the 1st floor to the 30th floor in a high-rise building. You
feel
some vertical acceleration initially, then almost nothing along the way, then
some deceleration at the end - without a visual indication within the elevator

of the floor, it can be impossible to accurately estimate the vertical
distance
traveled, particularly when the speed of an elevator may be different every
time and there may be significant perpendicular vibrations every second
(mimicking human steps).

[0077] In at least
some embodiments, noise/bias is handled by first
modeling the bias. In particular, nonlinear least squares optimization may
be used to estimate sensor bias and produce accurate per-frame device
rotation. In this
example, the sensor bias is modeled in the global
coordinate frame, although other embodiments may instead model bias in
the local (sensor) coordinate frame. Let {a0, a1, ...} denote the acceleration

measures minus the gravity in the global coordinate frame at the input
frames, with each symbol representing a 3D vector. Let us first consider
what kind of accelerations are expected for a simple straight walking motion
along a single lateral axis, as illustrated in the leftmost pane of 3 panes of

information 275 in Figure 2F. In the actual corresponding data shown in the
center pane, acceleration measurements for each of the three axes (XYZ)
are shown, with blue being the vertical acceleration (with frequent up and
down values, corresponding to footsteps), and green and red being the two
lateral/horizontal accelerations (which theoretically should look like the
ideal
acceleration in the left pane, but do not since they are extremely noisy) -
after simple smoothing, as shown in the rightmost pane, the green and red
data is more similar to the information of the leftmost pane, but would
nonetheless provide highly inaccurate data after double integration (e.g., off

by 10 meters after 10 seconds).
[0078] To correct the bias, it is first estimated as follows:
{ā0, ā1, ...}, āf = af + δ(f),
where āf denotes the refined acceleration at frame f, and δ(f) is the
estimated bias for frame f. This estimated bias is represented as a
piecewise linear model (see information 276 of Figure 2G). To do so,
several control points are used along the input frames, with each control
point having 3 variables to be estimated. The bias acceleration is obtained
by a simple linear combination of the surrounding control points at every
frame. The frequency of control points to be placed can be varied as
appropriate - for example, if 20 control points are placed, the nonlinear
least squares optimization problem to be solved will have 60 (=20x3)
parameters subject to the constraints. The purple line in the left pane of
information 276 shows the piece-wise linear model of the bias acceleration,
with small purple circles representing control points, and the height of each
purple disk being adjusted so that the corrected accelerations and the
corrected integrated velocities follow certain constraints. The right pane
illustrates the constraints that are imposed, as discussed further below.
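A small sketch of the piecewise-linear bias model: each control point carries a 3D bias vector, the per-frame bias is linearly interpolated between the surrounding control points, and the corrected acceleration is the raw value plus the bias at that frame; the control-point count and placement below are hypothetical.

```python
import numpy as np

def bias_at_frames(control_frames, control_biases, num_frames):
    """Evaluate the piecewise-linear bias model at every frame: the per-frame
    bias is the linear interpolation of the surrounding control points."""
    control_biases = np.asarray(control_biases, dtype=float)    # shape (num_controls, 3)
    frames = np.arange(num_frames)
    return np.stack([np.interp(frames, control_frames, control_biases[:, axis])
                     for axis in range(3)], axis=1)

def corrected_accelerations(raw_accels, control_frames, control_biases):
    """Apply the estimated bias to every raw acceleration sample."""
    raw_accels = np.asarray(raw_accels, dtype=float)
    return raw_accels + bias_at_frames(control_frames, control_biases, len(raw_accels))

# Hypothetical setup: 20 control points over 1000 frames gives 60 variables,
# initialized to 0 before optimization.
control_frames = np.linspace(0, 999, 20)
control_biases = np.zeros((20, 3))
```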
[0079] The selection of which constraints to enforce has a significant
effect on
the usefulness of double integration. In this example, the sum of squares of
these constraint equations (some use a robust norm) is minimized subject to
the bias terms, which are initialized to be 0, and with the ceres solver used
to
solve the problem in this example. The right pane of information 276
illustrates the prior terms that are imposed, as follows (the sampled frame sets
{param1, param2, ...} used in the constraints may, for example, be {50, 100, ...},
and this sampling interval can be specified as an input parameter; several of the
terms rely on asking the user to stop and stand still at the beginning and end of
the acquisition), with an illustrative sketch of these residual terms following the list:
- accelerations must be 0 at the first few seconds and at the last few
seconds (based on asking a user to stop and stand still at the beginning and
the end of the acquisition), corresponding to: for each f' ∈ {param1, param2, ...},
āf' = 0, where param1 and param2 are specified according to the length of the
non-movement interval;
- velocities (sum of accelerations from the start) must be 0 at the first few
seconds and at the last few seconds (based on asking a user to stop and stand
still at the beginning and the end of the acquisition), corresponding to: for each
f' ∈ {param1, param2, ...}, vf' = 0, where vf' = Σ (f = 1 to f') āf;
- velocity direction must be the same as the device's z axis (using the
z axis as the lateral forward direction), corresponding to: for each f' ∈
{param1, param2, ...}, ||vf'|| * (1.0 - v̂f' · {Device-Z}) = 0, where Device-Z is
obtained from the camera rotation and v̂f' is the normalized velocity direction,
so that this term is small when the velocity estimation at a frame coincides with
the direction of the device - while this assumption is often
true as a person normally walks straight, it may not be true
sometimes (e.g., when moving sideways to avoid obstacles), and therefore the
robust error metric (HuberLoss) is used;
- maximum velocity must not be more than a certain speed (e.g., with
velocity more than 1.5 m/sec being penalized), corresponding to: for each f' ∈
{param1, param2, ...}, max(||vf'|| - 1.5, 0) = 0;
- the norm of the velocities must be the same during the video (except for
some margin at the beginning and at the end), so as to penalize the differences
of velocities at different sampled frames (e.g., if F frames are sampled, there
would be (F choose 2) number of constraints), corresponding to: for each f' ∈
{param1, param2, ...} and each g' ∈ {param1, param2, ...}, ||vf'|| = ||vg'||; and
- the magnitude of the bias terms at control points should be as small as
possible (enforced as a weak regularization term, in which magnitudes are
penalized), corresponding to: Σ |(Cix, Ciy, Ciz)| = 0, where (Cix, Ciy, Ciz)
denotes the i-th acceleration bias variables.
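The sketch referenced above is given below: it builds the residual vector for a nonlinear least squares solve over the flattened control-point biases, using scipy's least_squares as a stand-in for the ceres solver, applying the Huber loss uniformly rather than only to the forward-direction term, and simplifying the pairwise constant-speed constraint to a deviation-from-mean term; all variable and parameter names are hypothetical.

```python
import numpy as np
from scipy.optimize import least_squares

def make_residuals(raw_accels, control_frames, still_frames, sample_frames,
                   device_z, dt, max_speed=1.5):
    """Residuals for the bias estimation: the variables are the flattened 3D
    biases at the control points, and the terms loosely mirror the constraints
    listed above."""
    raw_accels = np.asarray(raw_accels, dtype=float)
    num_frames = len(raw_accels)

    def residuals(x):
        biases = x.reshape(-1, 3)
        delta = np.stack([np.interp(np.arange(num_frames), control_frames,
                                    biases[:, k]) for k in range(3)], axis=1)
        a = raw_accels + delta                      # corrected accelerations
        v = np.cumsum(a, axis=0) * dt               # integrated velocities
        r = []
        for f in still_frames:                      # user stands still at start/end
            r.extend(a[f])                          # acceleration ~ 0
            r.extend(v[f])                          # velocity ~ 0
        speeds = np.linalg.norm(v[sample_frames], axis=1)
        for i, f in enumerate(sample_frames):
            forward = v[f] @ device_z[f] / (speeds[i] + 1e-9)
            r.append(speeds[i] * (1.0 - forward))   # velocity along the device's z axis
            r.append(max(speeds[i] - max_speed, 0.0))  # penalize speeds above 1.5 m/s
        r.extend(speeds - speeds.mean())            # speeds roughly constant (simplified)
        r.extend(0.01 * x)                          # weak regularization of bias magnitudes
        return np.asarray(r)

    return residuals

# Hypothetical usage: 20 control points give 60 variables, initialized to 0.
# fun = make_residuals(raw_accels, control_frames, still_frames, sample_frames, device_z, dt)
# solution = least_squares(fun, np.zeros(20 * 3), loss='huber')
```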
[0080] As described above, initial inter-panorama connection information
based
on image/feature matching and/or by using captured linking information may be
determined for some panorama images in a group representing a building
interior or other structure, such as to use image/feature matching to
inter-connect nearby panorama images that have sufficient overlap in their
respective scenes, and/or to use captured linking information to inter-connect
successive panorama images based on the user's travel path between their
corresponding viewing locations. In at least some embodiments, an additional
analysis is subsequently performed (e.g., as an energy optimization process),
such as to verify the previously determined inter-connections on an overall
global basis, and/or to determine additional inter-panorama connections based
on the previously determined relative positions of previously unconnected
panorama images.
[0081] Figure 3 is a block diagram illustrating an embodiment of a server
computing system 300 executing an implementation of a BICA system 340 -
the server computing system and BICA system may be implemented using
a plurality of hardware components that form electronic circuits suitable for
and configured to, when in combined operation, perform at least some of
the techniques described herein. In the illustrated embodiment, the server
computing system 300 includes one or more hardware central processing
units ("CPU") or other processors 305, various input/output ("I/O")
components 310, storage 320, and memory 350, with the illustrated I/O
components including a display 311, a network connection 312, a computer-
readable media drive 313, and other I/O devices 315 (e.g., keyboards, mice
or other pointing devices, microphones, speakers, GPS receivers, etc.).
The server computing system 300 and executing BICA system 340 may
communicate with other computing systems via one or more networks 399
(e.g., the Internet, one or more cellular telephone networks, etc.), such as
BICA user computing systems 360 (e.g., used to capture building interior
data), client computing systems 380 (e.g., on which generated building
representations may be presented to end users), and other computing
systems 390.
[0082] In the illustrated embodiment, an embodiment of the BICA system
340 executes in memory 350 in order to perform at least some of the
described techniques, such as by using the processor(s) 305 to execute
software instructions of the system 340 in a manner that configures the
processor(s) 305 and computing system 300 to perform automated
operations that implement those described techniques. The illustrated
embodiment of the BICA system includes Building Interior Data Acquisition
manager component 342 (e.g., in a manner corresponding to Building
Interior Data Acquisition manager 262 of Figure 1B), Panorama Generation
manager component 348 (e.g., in a manner corresponding to Panorama
Generation manager 266 of Figure 1B), Panorama Connection manager
component 346 (e.g., in a manner corresponding to Panorama Connection
manager 264 of Figure 1B), Building Interior Representation Presentation
manager 344 (e.g., in a manner corresponding to Building Interior
Representation Presentation manager 268 of Figure 1B), and optionally
other components that are not shown, with the memory further optionally
executing one or more other programs and components 349. As part of
such automated operations, the BICA system 340 may store and/or retrieve
various types of data, including in the example database data structures of
storage 320, such as various types of user information in database ("DB")
322, acquired building interior data (e.g., viewing location visual data,
linking
information, etc.) in DB 326, generated building information (e.g., linked
panoramas, etc.) in DB 324, and/or various types of optional additional
information 328 (e.g., various analytical information related to presentation
or other use of one or more building interiors previously captured, analyzed,
and/or presented by the BICA system).
[0083] Some or all of the user computing systems 360 (e.g., mobile
devices), client computing systems 380, and other computing systems 390
may similarly include some or all of the types of components illustrated for
server computing system 300. As a non-limiting example, the user
computing systems 360 include hardware CPU(s) 361, I/O components 362,
storage 366, and memory 367. In the depicted embodiment, the user
computing systems 360 also include an imaging system 364, and both a
browser 368 and BICA client application 369 are executing within memory
367, such as to participate in communication with the BICA system 340
and/or other computing systems.
[0084] It will be appreciated that computing system 300 and other systems
and devices included within Figure 3 are merely illustrative and are not
intended to limit the scope of the present invention. The systems and/or
devices may instead each include multiple interacting computing systems or
devices, and may be connected to other devices that are not specifically
illustrated, including via Bluetooth communication or other direct
communication, through one or more networks such as the Internet, via the
Web, or via one or more private networks (e.g., mobile communication
networks, etc.). More generally, a device or other computing system may
comprise any combination of hardware that may interact and perform the
described types of functionality, optionally when programmed or otherwise
configured with particular software instructions and/or data structures,
including without limitation desktop or other computers (e.g., tablets,
slates,
etc.), database servers, network storage devices and other network devices,
smart phones and other cell phones, consumer electronics, wearable
devices, digital music player devices, handheld gaming devices, PDAs,
wireless phones, Internet appliances, and various other consumer products
that include appropriate communication capabilities. In
addition, the
functionality provided by the illustrated BICA system 340 may in some
embodiments be distributed in various components other than those
specifically illustrated, some of the illustrated functionality of the BICA
system 340 may not be provided, and/or other additional functionality may
be available. In addition, in certain implementations, various functionality
of
the BICA system may be provided by third-party partners of an operator of
the BICA system - for example, generated building interior representations
may be provided to other systems that present that information to end users
or otherwise use that generated information, data collected by the BICA
system may be provided to a third party for analysis and/or metric
generation, etc.
[0085] It will also be appreciated that, while various items are
illustrated as
being stored in memory or on storage while being used, these items or
portions of them may be transferred between memory and other storage
devices for purposes of memory management and data integrity.
Alternatively, in other embodiments some or all of the software components
and/or systems may execute in memory on another device and
communicate with the illustrated computing systems via inter-computer
communication. Thus, in some embodiments, some or all of the described
techniques may be performed by hardware means that include one or more
processors and/or memory and/or storage when configured by one or more
software programs (e.g., the BICA system 340 and/or BICA client software
executing on user computing systems 360 and/or client computing devices
380) and/or data structures, such as by execution of software instructions of
the one or more software programs and/or by storage of such software
instructions and/or data structures. Furthermore, in some embodiments,
some or all of the systems and/or components may be implemented or
provided in other manners, such as by consisting of one or more means that
are implemented at least partially in firmware and/or hardware (e.g., rather
than as a means implemented in whole or in part by software instructions
that configure a particular CPU or other processor), including, but not
limited
to, one or more application-specific integrated circuits (ASICs), standard
integrated circuits, controllers (e.g., by executing appropriate instructions,
and including microcontrollers and/or embedded controllers), field-
programmable gate arrays (FPGAs), complex programmable logic devices
(CPLDs), etc. Some or all of the components, systems and data structures
may also be stored (e.g., as software instructions or structured data) on a
non-transitory computer-readable storage medium, such as a hard disk or
flash drive or other non-volatile storage device, volatile or non-volatile
memory (e.g., RAM or flash RAM), a network storage device, or a portable
media article (e.g., a DVD disk, a CD disk, an optical disk, a flash memory
device, etc.) to be read by an appropriate drive or via an appropriate
connection. The systems, components and data structures may also in
some embodiments be transmitted via generated data signals (e.g., as part
of a carrier wave or other analog or digital propagated signal) on a variety
of
computer-readable transmission mediums, including wireless-based and
wired/cable-based mediums, and may take a variety of forms (e.g., as part
of a single or multiplexed analog signal, or as multiple discrete digital
packets or frames). Such computer program products may also take other
forms in other embodiments. Accordingly, embodiments of the present
disclosure may be practiced with other computer system configurations.
[0086] Figures 4-7 depict exemplary automated operations for acquiring,
analyzing and presenting information regarding a building, such as may be
performed in some embodiments in part or in whole by a BICA system (e.g.,
by the BICA application 155 of Figure 1A, by one or more components of
the BICA system 260 of networked environment 180 depicted by Figure 1B,
and/or the BICA system 340 executed by the server computing system 300
of Figure 3). In particular, Figure 4 depicts an example overview process
flow for a BICA routine; Figure 5 depicts an example process flow for a
building interior data acquisition routine; Figures 6A-6B depict an example
process flow for a panorama connection routine; and Figure 7 depicts an
example process flow for a building interior representation presentation
routine.
[0087] Figure 4 illustrates an example flow diagram for an embodiment of a
Building Interior Capture and Analysis routine 400. The routine may be
performed by, for example, execution of the BICA application 155 of Figure
1A, the BICA system 260 of Figure 1B, the BICA system 340 of Figure 3,
and/or the BICA system discussed with respect to Figures 2A-2I, such as to
acquire, analyze and use information about a building interior, including to
generate and present a representation of the building interior. While the
illustrated embodiment acquires and uses information from the interior of a
target building, it will be appreciated that other embodiments may perform
similar techniques for other types of data, including for non-building
structures and/or for information external to one or more target buildings of
interest.
[0088] The
illustrated embodiment of the routine begins at block 405, where
instructions or information are received. At block
410, the routine
determines whether the received instructions or information indicate to
acquire data representing a building interior. If so, the routine proceeds to
block 415 in order to perform a building interior data acquisition subroutine
(with one example of such a routine illustrated in Figure 5, as discussed
further below) in order to acquire data representing the interior of a target
building of interest. If it was determined at block 410 that the received
instructions or information did not indicate to acquire data representing a
building interior, the routine proceeds to block 420 to determine whether the
received instructions or information included building interior data that has
already been acquired, such as a transmission of previously captured
building interior data from a remote mobile device (e.g., a mobile device
executing a local instance of a BICA application), with the transmission of
acquired building interior data from mobile device 185 to BICA system 260
via network(s) 170 of Figure 1B illustrating one example of such data
acquisition. Following block 415, or after a determination is made in block
420 that the received instructions or information include acquired building
interior data, the routine proceeds to block 417 in order to store the
acquired
data.
[0089] After block 417, the routine proceeds to block 425, in which
(whether
via local processing, remote processing, or some combination thereof) a
panorama image is generated for each viewing location of the captured
building interior based on a corresponding recorded video or other acquired
visual information for the viewing location. The routine then proceeds to
block 430 in order to perform a panorama image connection subroutine
(with one example of such a routine illustrated in Figures 6A-6B, as
discussed further below) in order to determine inter-connections between
the panorama images for the interior of the target building of interest, such
as based on performing image/feature matching and/or using captured
linking information.
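For illustration, a minimal Python sketch of how a panorama might be assembled from frames sampled out of the video recorded at a viewing location, using OpenCV's stitcher; the frame-sampling step and the use of OpenCV are assumptions of the example, not requirements of the described techniques.

# Illustrative sketch: build a panorama for one viewing location from frames
# sampled out of the 360-degree video recorded at that location.
import cv2

def panorama_from_video(video_path, frame_step=15):
    capture = cv2.VideoCapture(video_path)
    frames, index = [], 0
    while True:
        ok, frame = capture.read()
        if not ok:
            break
        if index % frame_step == 0:   # keep every Nth frame to limit overlap
            frames.append(frame)
        index += 1
    capture.release()
    stitcher = cv2.Stitcher_create(cv2.Stitcher_PANORAMA)
    status, panorama = stitcher.stitch(frames)
    if status != cv2.Stitcher_OK:
        raise RuntimeError(f"stitching failed with status {status}")
    return panorama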
[0090] It will be appreciated that, despite the categorical method of
processing depicted in Figure 4 in which all panorama images are
generated based on analysis of recorded viewing location information prior
to determining relative positional information for viewing locations, any
order
for such processing may be implemented in accordance with the techniques
described herein. For example, the routine may instead process individual
segments of captured information sequentially, such that a panorama image
is generated for a first viewing location, followed by processing of linking
information captured during movement away from that first viewing location
to determine relative positional information for a second viewing location; a
panorama image generated for the second viewing location, followed by
processing of linking information captured during movement away from that
second viewing location to determine relative positional information for a
third viewing location; etc. In various embodiments, processing of captured
information for one or many building interiors may be performed in a parallel
and/or distributed manner, such as by utilizing one or more parallel
processing computing clusters (e.g., directly by the BICA system or via one
or more third-party cloud computing services).
[0091] After block 430, the routine continues to block 435, and creates and
stores a representation of the captured building interior based on the
panorama images generated in block 425 that are linked using the relative
positional information for the multiple viewing locations determined in block
430. In particular, and as described elsewhere herein, each panorama
image (corresponding to one viewing location within the building interior) is
associated with information reflecting one or more user-selectable links to
one or more other of the viewing locations, such that selection of a user-
selectable link while viewing a panorama image associated with one viewing
location initiates display of a distinct other panorama image associated with
another viewing location.
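A hypothetical Python sketch of the kind of linked-panorama data structure block 435 could produce is shown below; all class and field names are illustrative and are not taken from the disclosure.

# Illustrative sketch of a linked-panorama representation: each panorama
# carries user-selectable links that point at other viewing locations,
# anchored to the frames facing in the linked direction.
from dataclasses import dataclass, field
from typing import List

@dataclass
class InterPanoramaLink:
    target_location_id: str     # viewing location the link leads to
    direction_degrees: float    # bearing of the target from this panorama
    frame_indices: List[int]    # frames/columns where the link is displayed

@dataclass
class PanoramaRecord:
    location_id: str
    image_path: str
    links: List[InterPanoramaLink] = field(default_factory=list)

@dataclass
class BuildingInteriorRepresentation:
    building_id: str
    panoramas: List[PanoramaRecord] = field(default_factory=list)
    start_location_id: str = ""   # panorama shown first during presentation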
[0092] If it was determined in block 420 that the instructions or
information
received in block 405 did not indicate to receive acquired building interior
data, the routine proceeds to block 485, in which it determines whether the
received instructions or information include an indication to present a
previously stored representation of a building interior. If so, the routine
proceeds to block 440 to perform a building interior representation
presentation subroutine, to cause a display or other presentation of a
created representation of a target building interior (such as via a client
computing system of an end user, and with one example of such a routine
illustrated in Figure 7, as discussed further below).
[0093] If it was determined in block 485 that the instructions or
information
received in block 405 did not include an indication to present a previously
stored representation of a building interior, control passes to block 490 to
perform any other indicated operations as appropriate, such as any
housekeeping tasks, to obtain and store information about users of the
system, to configure parameters to be used in various operations of the
system, etc.
[0094] Following blocks 435, 440, or 490, the routine proceeds to block
495
to determine whether to continue, such as until an explicit indication to
terminate. If it is determined to continue, control returns to block 405 to
await additional instructions or information, and if not proceeds to step 499
and ends.
[0095] Figure 5
illustrates an example flow diagram for an embodiment of a
Building Interior Data Acquisition routine 500. The routine
may be
performed by, for example, execution of the Building Interior Data
Acquisition manager 262 of Figure 1B, the Building Interior Data Acquisition
manager component 342 of Figure 3, the BICA application 155 of Figure 1A,
and/or the BICA system discussed with respect to Figures 2A-2I, such as to
acquire information about a building interior, including visual information
for
each of multiple viewing locations and linking information from travel
between at least some of the viewing locations. In addition, the routine may
be initiated in various manners in various embodiments, such as from block
415 of Figure 4, or instead in some embodiments by a user interacting
directly with a mobile device to initiate video recording inside a building
interior (e.g., by a local BICA application on the mobile device; or, if the video
recording of the building interior is performed as one or more videos without
the use of a local BICA application, with corresponding IMU linking
information later retrieved from the mobile device; etc.).
[0096] The illustrated embodiment of the routine begins at block 510, in
which the routine initiates recording video and/or sensor data at a first
viewing location within the building interior as a mobile device with imaging
capabilities is rotated around a vertical axis located at the first viewing
location. In addition, the routine may, in some embodiments, optionally
monitor the motion of the mobile device during the recording at the first
viewing location, and provide one or more guidance cues to the user
regarding the motion of the mobile device, quality of the video being
recorded, associated lighting/environmental conditions, etc. At block 515,
the routine determines that video recording of the viewing location is
completed. As discussed elsewhere herein, such determination may be
based on an explicit indication from a user of the mobile device, or may be
automatically determined based on one or more of an analysis of sensor
data, the video being recorded, the user remaining substantially motionless
for a defined period of time, etc. At block 520, the routine optionally
obtains
annotation and/or other information from the user regarding the captured
viewing location. For example, in certain embodiments the BICA system
may record audible or textual annotations from the user to further describe
the viewing location (e.g., to provide a label or other description of the
viewing location, to describe aspects of the viewing location that the
recorded video or sensor data may not adequately capture, etc.), such as
for later use in presentation of information about that viewing location.
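As a hedged illustration of one way the determination that the user is "substantially motionless" could be automated, the following Python sketch thresholds the variance of accelerometer magnitude over a sliding window; the window length and threshold are arbitrary example values.

# Illustrative sketch: decide that capture at a viewing location is complete
# once the device has stayed substantially motionless for a defined period,
# using the variance of accelerometer magnitude over a sliding window.
import numpy as np

def is_substantially_motionless(accel_samples, sample_rate_hz,
                                window_seconds=2.0, variance_threshold=0.02):
    """accel_samples: array of shape (N, 3) with recent (ax, ay, az) readings."""
    window = int(window_seconds * sample_rate_hz)
    if len(accel_samples) < window:
        return False
    recent = np.asarray(accel_samples[-window:])
    magnitudes = np.linalg.norm(recent, axis=1)
    return float(np.var(magnitudes)) < variance_threshold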
[0097] After blocks 515 and 520, the routine proceeds to block 525 to
initiate the capture of linking information (including acceleration data)
during
movement of the mobile device along a travel path away from the current
viewing location and towards a next viewing location within the building
interior. As described elsewhere herein, the captured linking information
may include additional sensor data, as well as additional video information,
recorded during such movement. Initiating the capture of such linking
information may be performed in response to an explicit indication from a
user of the mobile device or based on one or more automated analyses of
information recorded from the mobile device. In addition, and in a manner
similar to that noted with respect to capturing the first viewing location in
block 510, the routine may further optionally monitor the motion of the
mobile device in some embodiments during movement to the next viewing
location, and provide one or more guidance cues to the user regarding the
motion of the mobile device, quality of the sensor data and/or video
information being captured, associated lighting/environmental conditions,
advisability of capturing a next viewing location, and any other suitable
aspects of capturing the linking information. Similarly, the routine may
optionally obtain annotation and/or other information from the user regarding
the travel path, such as for later use in presentation of information
regarding
that travel path or a resulting inter-panorama connection link.
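A minimal, hypothetical Python sketch of a container for such captured linking information is shown below; the field names and units are assumptions of the example.

# Illustrative sketch of a container for the linking information captured while
# the user moves from one viewing location to the next: timestamped IMU samples
# plus an optional reference to video recorded along the travel path.
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

@dataclass
class ImuSample:
    timestamp: float                            # seconds since capture start
    acceleration: Tuple[float, float, float]    # device-frame accel (m/s^2)
    rotation_rate: Tuple[float, float, float]   # gyroscope (rad/s)

@dataclass
class LinkingSegment:
    from_location_id: str
    to_location_id: Optional[str] = None   # filled in on arrival
    imu_samples: List[ImuSample] = field(default_factory=list)
    video_path: Optional[str] = None       # video captured during movement
    user_annotation: Optional[str] = None  # optional note about the travel path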
[0098] At block 530, the routine determines that the mobile device has
arrived at a next viewing location after the user travel path segment to which
the linking information corresponds, for use as a new current viewing
location. As
described in greater detail elsewhere herein, such
determination may be based on one or more of an explicit user request, an
analysis of incoming sensor data, recorded video information, the user
remaining substantially motionless for a defined period of time, etc. In
response to the determination, the routine proceeds to block 535 to initiate
capture of the current viewing location in a manner similar to that described
for blocks 510-520 with respect to the first viewing location. In particular,
the routine initiates recording of video and/or sensor data at the current
viewing location within the building interior as the mobile device is rotated
around a vertical axis located at the current viewing location, optionally
monitoring the recording to provide one or more guidance cues to the user
regarding the capture process. At block 540, in a manner similar to that
noted with respect to block 515 for the first viewing location, the routine
determines that recording of the current viewing location is completed, such
as based on an explicit indication from a user, and/or one or more analyses
of information from the mobile device. At block 545, the routine optionally
obtains annotation and/or other information from the user regarding the
captured viewing location and/or the travel path from the previous viewing
location, such as audible or textual annotations from the user to further
describe the viewing location or travel path, such as for later use in
presentation of information regarding that viewing location and/or use of that
travel path.
[0099] The routine proceeds to block 555 to determine whether all viewing
locations within the building interior that have been selected by the user
have been captured, such as based on an express request by a user of the
mobile device to terminate the capturing process or, alternatively, a
determination that the capturing process is to continue (such as via analysis
of acceleration or other sensor data indicating that the mobile device is
moving to a subsequent viewing location). If it is determined that the
capturing process is to continue - i.e., that not all viewing locations for the
building interior have yet been captured by the mobile device - the routine
returns to block 525 in order to capture linking information during movement
of the mobile device to the next viewing location in sequence within the
building interior. Otherwise, the routine proceeds to block 560 to optionally
analyze viewing location information, such as in order to identify possible
additional coverage (and/or other information) to acquire within the building
interior. For example, the BICA system may provide one or more
notifications to the user regarding the information acquired during capture of
the multiple viewing locations and corresponding linking information, such
as if it determines that one or more segments of the recorded information
are of insufficient or undesirable quality to serve as the basis for
generating
a panorama image, or do not appear to provide complete coverage of the
building, or would provide information for additional inter-panorama links.
[00100] After block 560, the routine proceeds to block 590 to store the
acquired data and/or to transmit the acquired data from the mobile device to
a remote BICA system (such as for analysis and/or storage by the remote
BICA system for future use). The routine then proceeds to block 599 and
ends. In situations in which the routine 500 is invoked from block 415 of
Figure 4, the routine will then return to block 417 of Figure 4, including to
provide the acquired building interior data to that routine 400.
[00101] Figures 6A-6B illustrate an example flow diagram for an
embodiment of a Panorama Connection routine 600. The routine may be
performed by, for example, execution of the Panorama Connection manager
264 of Figure 1B, the Panorama Connection manager component 346 of
Figure 3, the BICA application 155 of Figure 1A, and/or the BICA system
discussed with respect to Figures 2A-2I, such as to determine inter-
panorama connection information based on using captured linking
information and/or image/feature matching. In addition, the routine may be
initiated in various manners in various embodiments, such as from block
430 of Figure 4.
[00102] In the illustrated embodiment, the routine begins at block 605,
where
a next pair of panorama images is selected to be analyzed for inter-
connection information, beginning with a first pair that includes the first
and
second panorama images corresponding to the first and second viewing
locations in a sequence of multiple viewing locations within a house, building

or other structure. The routine then continues to block 610 to determine
whether to attempt to determine connection information between the pair of
panorama images via image/feature matching, such as based on overlap of
features in images/frames from the two panorama images, and if so,
continues to block 615. It will be appreciated that in some embodiments,
connection determination via image/feature matching may not be
performed, such as if all connection information between pairs of panorama
images is determined using captured linking information, as discussed in
greater detail with respect to blocks 655-670.
[00103] In the illustrated embodiment, the routine in block 615 begins by
optionally filtering pairs of frames/images from the panorama images (e.g.,
corresponding to individual frames from a video used to construct the
panorama images) that do not have sufficient overlapping coverage,
although in other embodiments each image/frame in one of the two
panoramas may be compared to each image/frame in the other of the two
panorama images to determine an amount of overlap, if any, between the
pair of images. In the illustrated embodiment, the routine continues to block
620 from block 615, where it matches non-filtered pairs of frames/images
from the two panorama images with overlapping coverage using one or both
of essential matrix and/or homography matrix decomposition processing
techniques, although other processing techniques may be used in other
embodiments. In addition, the routine may optionally select in block 620
whether to retain and use results for each pair from only one of essential
matrix processing and homography matrix decomposition processing if both
are performed, such as depending on whether information in the pair of
frames corresponds to a flat planar surface or instead as information in a 3D
space. In other embodiments, results from both essential matrix processing
and homography matrix decomposition processing may be retained and
used, or instead only one of the two (and possibly other) types of processing
may be used. The routine further continues in block 620 to determine
relative rotation and translation/distance between the viewing locations for
the two panorama images from the results of the one or more processing
techniques, optionally by combining results from multiple matching
image/frame pairs to determine aggregate consensus inter-panorama
connection information, and optionally computing a confidence value in the
resulting information, as discussed in greater detail elsewhere herein.
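For illustration, a Python sketch using OpenCV shows one common way to match features between a pair of frames and recover relative pose via essential-matrix and homography decomposition; the ORB features, RANSAC settings and the assumed intrinsics matrix K are example choices, not requirements of the described techniques.

# Illustrative sketch: estimate relative rotation and translation direction
# between two frames (one from each panorama) via feature matching and
# essential-matrix recovery; a homography is also fit for near-planar scenes.
import cv2
import numpy as np

def relative_pose_from_frames(frame_a, frame_b, K):
    """K: 3x3 camera intrinsics matrix (assumed known or approximated)."""
    orb = cv2.ORB_create(2000)
    kps_a, desc_a = orb.detectAndCompute(frame_a, None)
    kps_b, desc_b = orb.detectAndCompute(frame_b, None)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(desc_a, desc_b), key=lambda m: m.distance)
    pts_a = np.float32([kps_a[m.queryIdx].pt for m in matches])
    pts_b = np.float32([kps_b[m.trainIdx].pt for m in matches])

    # Essential-matrix path (general 3D scene structure).
    E, inliers = cv2.findEssentialMat(pts_a, pts_b, K, method=cv2.RANSAC)
    _, R, t, _ = cv2.recoverPose(E, pts_a, pts_b, K, mask=inliers)

    # Homography path (dominant planar surface); decomposition yields candidate
    # rotations/translations that a caller could select between.
    H, _ = cv2.findHomography(pts_a, pts_b, cv2.RANSAC)
    _, rotations, translations, _ = cv2.decomposeHomographyMat(H, K)
    return R, t, rotations, translations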
[00104] After block 620, the routine continues to block 625 to determine
whether to attempt to also connect the two panorama images via analysis of
captured linking information along a travel path that the user took between
the viewing locations corresponding to the two panorama images. If so, or if
it is instead determined in block 610 to not attempt to connect the two
panorama images via image matching, the routine continues to perform
blocks 650-670 to use such linking information to determine relative rotation
and location/direction/distance between the panorama images. In particular,
the routine determines in block 650 whether the two panorama images are
consecutive images in the sequence, such that linking information is
available for a travel path that the user travels between the two viewing
locations corresponding to the two panorama images, and if not continues to
block 630. Otherwise, the routine continues to block 655 to obtain that
linking information for that travel path, including acceleration data from the
mobile device IMU sensor unit(s), and optionally video information as well if
available.
[00105] After block 655, the routine continues to block 660 to determine
the
departure direction of leaving the viewing location corresponding to the start
panorama image and the arrival direction of arriving at the viewing location
of the end panorama image, using video information if available to match
initial video information for the departure to one or more corresponding
frames of the start panorama image and to match final video information for
the arrival to one or more corresponding opposite-side frames of the end
panorama image. If video information is not available, leaving and arrival
directions may be determined in other manners, such as based solely on
analysis of the captured acceleration data and/or other location information
for the mobile device. After block 660, the routine continues to block 665 to
analyze the acceleration data in the captured linking information along the
travel path - in particular, for each acceleration data point, a double
integration operation is performed to determine first velocity and then
location corresponding to that acceleration data point, including in the
illustrated embodiment to determine corresponding velocity and location for
each of x, y, and z axes in three dimensions. In block 670, the routine then
combines the determined velocity and location for each of the acceleration
data points to form a modeled travel path, along with the determined
leaving/arriving directions, and uses the resulting information to determine
relative rotation and location/distance between the panorama images,
optionally with a corresponding confidence value. Additional details related
to the analysis and use of such linking information is discussed in greater
detail elsewhere herein.
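A minimal Python sketch of the double-integration step follows; it assumes the acceleration samples have already been gravity-compensated and expressed in a common frame, details the described techniques may handle differently.

# Illustrative sketch of the double-integration step: integrate acceleration
# once for velocity and again for position on each of the x, y and z axes.
import numpy as np

def integrate_travel_path(timestamps, accelerations):
    """timestamps: shape (N,); accelerations: shape (N, 3) in m/s^2."""
    t = np.asarray(timestamps, dtype=float)
    a = np.asarray(accelerations, dtype=float)
    dt = np.diff(t, prepend=t[0])
    velocity = np.cumsum(a * dt[:, None], axis=0)         # first integration
    position = np.cumsum(velocity * dt[:, None], axis=0)  # second integration
    return velocity, position  # position[-1] approximates the net displacement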
[00106] After block 670, or if it is instead determined in block 650 that the two
panorama images do not have captured linking information for a travel path
between them, the routine continues to block 630 to, if connection
information is available from both image matching and linking information,
combine the information into a final determined aggregate relative direction
and distance/location for the panorama images, along with the resulting
confidence value from the combination. After block 630, or if it is instead
determined in block 625 to not use linking information to connect the two
panorama images, the routine continues to block 635 to, for each panorama
in the pair and based on the determined relative position information,
determine a direction of the other panorama relative to the current
panorama starting point, identify one or more frames in the current
panorama that correspond to that determined direction, and store
information for the current panorama about an inter-panorama link to the
other panorama for those one or more frames.
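As an illustrative sketch only, the following Python functions show one way the two direction estimates of blocks 630-635 could be fused by confidence weighting, and the resulting bearing mapped to frames of a 360-degree panorama; the circular averaging and uniform frame spacing are assumptions of the example.

# Illustrative sketch: fuse the image-matching and linking-information
# direction estimates by confidence weighting, then map the fused bearing
# onto frame indices where the inter-panorama link can be anchored.
import math

def fuse_directions(bearing_image_deg, conf_image, bearing_link_deg, conf_link):
    # Average on the unit circle so that, e.g., 359 and 1 degrees fuse to 0.
    x = conf_image * math.cos(math.radians(bearing_image_deg)) + \
        conf_link * math.cos(math.radians(bearing_link_deg))
    y = conf_image * math.sin(math.radians(bearing_image_deg)) + \
        conf_link * math.sin(math.radians(bearing_link_deg))
    return math.degrees(math.atan2(y, x)) % 360.0

def frames_for_bearing(bearing_deg, total_frames, span_frames=3):
    # The panorama's frames are assumed to cover 360 degrees uniformly.
    center = round(bearing_deg / 360.0 * total_frames) % total_frames
    half = span_frames // 2
    return [(center + offset) % total_frames for offset in range(-half, half + 1)]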
[00107] After block 635, the routine continues to block 645 to
determine
whether there are more pairs of panorama images to analyze, and if so,
returns to block 605 to select the next such pair. In some embodiments,
each consecutive pair of panorama images in the sequence of viewing
locations is analyzed, and then some or all other pairs of panorama images
that do not have corresponding linking information based on a travel path
between those viewing locations are considered, so as to determine and
provide inter-panorama connection information for all pairs of panorama
images for which information is available. As discussed in greater detail
elsewhere herein, however, in some embodiments some links between pairs of
panoramas may not be provided even if they can be calculated, such as to
provide inter-panorama links upon display to an end user only for
a subset of panorama pairs (e.g., corresponding to panorama pairs that are
visible to each other, or near each other within a defined distance, or
otherwise satisfy one or more specified criteria).
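For illustration, a small Python sketch of one such criterion, keeping only panorama pairs whose determined viewing locations lie within an example maximum distance; the threshold value is an arbitrary assumption.

# Illustrative sketch: even when a relative position can be computed for every
# panorama pair, displayed links might be limited to pairs whose viewing
# locations lie within a chosen distance of one another.
import math

def pairs_to_display(locations, max_link_distance=8.0):
    """locations: dict mapping location_id -> (x, y) position in meters."""
    ids = sorted(locations)
    selected = []
    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            dx = locations[a][0] - locations[b][0]
            dy = locations[a][1] - locations[b][1]
            if math.hypot(dx, dy) <= max_link_distance:
                selected.append((a, b))
    return selected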
[00108] If it is instead determined in block 645 that there are no more
pairs of
panorama images to consider, the routine continues to block 690 to
optionally perform a global review of the respective panorama locations and
the relative directions between them for overall consistency, and to update
that determined information as appropriate, as discussed in greater detail
elsewhere. If so, such
an update may include updating the stored
information for one or more panoramas about one or more inter-panorama
links from that panorama to one or more other panoramas. After block 690,
the routine continues to block 695 to provide information about the
determined linked panorama images, and continues to block 699 and ends.
In situations in which the routine 600 is invoked from block 430 of Figure 4,
the routine will then return to block 435 of Figure 4, including to provide
the
information about the determined linked panorama images to that routine
400.
[00109] Figure 7
illustrates an example flow diagram for an embodiment of a
Building Interior Representation Presentation routine 700. The routine may
be performed by, for example, execution of the Building Interior
Representation Presentation manager 268 of Figure 1B, the Building Interior
Representation Presentation manager component 344 of Figure 3, the BICA
application 155 of Figure 1A, and/or the BICA system discussed with
respect to Figures 2A-2I, such as to display or otherwise present information
about the generated representation of the interior of one or more target
buildings. In addition, the routine may be initiated in various manners in
various embodiments, such as from block 440 of Figure 4, or instead in
some embodiments by an end user interacting with his or her client device
to obtain (e.g., retrieve from a remote location over one or more networks
and/or from local storage) information about one or more linked panorama
images representing a building interior. While the illustrated embodiment
includes the linked panorama images representing or covering a single
house, building or other structure, in other embodiments the linked
panoramas or other linked visual information may extend beyond a single
such structure, as discussed in greater detail elsewhere herein.
[00110] The example embodiment of the routine begins at block 705, in
which a user request is received for displaying of presentation information
regarding a specified building interior that has been previously captured. In
response to the user request, the routine proceeds to block 710 to retrieve
stored presentation information regarding the specified building interior.
Once the presentation information is retrieved, the routine proceeds to block
715, and causes a client computing system associated with the user request
to display an initial panorama image corresponding to a determined first
viewing location within the specified building interior, as well as to display

indications of one or more visual inter-panorama links to corresponding
additional viewing locations, such as by transmitting information to the
client
computing system that includes at least the initial panorama image and its
inter-panorama links (and optionally corresponding information for some or
all other panorama images for the building). As described elsewhere herein,
the initial panorama image may or may not correspond to the viewing
location first captured within the specified building interior. In addition,
it will
be appreciated that an end user may use various local controls to
manipulate the initial panorama image in various manners, such as to move
horizontally and/or vertically within the panorama image to display different
views (e.g., different directions within the building from the viewing
location
to which the initial panorama image corresponds), to zoom in or out, to
apply various filters and/or otherwise adjust the quality or type of
information
displayed (e.g., if the initial panorama image is constructed from one or
more rotations at the viewing location that use different settings or
otherwise
acquire different types of data, such as one rotation that captures visible
light, another rotation that captures infrared light/energy, another rotation
that captures ultraviolet light/energy, etc.).
[00111] At block 720, after the end user is done with the initial panorama
image, the routine determines whether the end user has selected one of the
provided links associated with the displayed panorama image, or has
instead indicated that the end user is done (e.g., closed the current
panorama image and/or its local viewing application on the client system). If
the end user is done, the routine continues to block 799 and ends.
Otherwise, responsive to the end user selection of one of the displayed
links, at block 725 the routine causes the associated client computing
system to display a distinct additional panorama image (or other
information) corresponding to the selected link in a manner similar to that
described with respect to block 715, as well as to display indications of one
or more additional links to corresponding additional viewing locations as
appropriate for the additional panorama image - as part of doing so, the
server system providing the building representation information may
optionally transmit additional corresponding information to the client
computing system at that time in a dynamic manner for display, or the client
computing system may instead optionally retrieve information that was
previously sent with respect to block 715 and use that. After block 725, the
routine returns to block 720 to await an indication of another user selection
of one of the user-selectable links provided as part of the presentation, or
to
otherwise indicate that the end user is done. In situations in which the
routine 700 is invoked from block 440 of Figure 4, the routine will then
return
to block 495 of Figure 4 when it reaches block 799.
[00112] Aspects of
the present disclosure are described herein with
reference to flowchart illustrations and/or block diagrams of methods,
apparatus (systems), and computer program products according to
embodiments of the present disclosure. It will be appreciated that each
block of the flowchart illustrations and/or block diagrams, and combinations
of blocks in the flowchart illustrations and/or block diagrams, can be
implemented by computer readable program instructions. It will be further
appreciated that in some implementations the functionality provided by the
routines discussed above may be provided in alternative ways, such as
being split among more routines or consolidated into fewer routines.
Similarly, in some implementations illustrated routines may provide more or
less functionality than is described, such as when other illustrated routines
instead lack or include such functionality respectively, or when the amount
of functionality that is provided is altered. In
addition, while various
operations may be illustrated as being performed in a particular manner
(e.g., in serial or in parallel, or synchronous or asynchronous) and/or in a
particular order, in other implementations the operations may be performed
in other orders and in other manners. Any data structures discussed above
may also be structured in different manners, such as by having a single data
structure split into multiple data structures or by having multiple data
structures consolidated into a single data structure. Similarly, in some
implementations illustrated data structures may store more or less
information than is described, such as when other illustrated data structures
instead lack or include such information respectively, or when the amount or
types of information that is stored is altered.
[00113] From the foregoing it will be appreciated that, although
specific
embodiments have been described herein for purposes of illustration,
various modifications may be made without deviating from the spirit and
scope of the invention. Accordingly, the invention is not limited except as by
corresponding claims and the elements recited by those claims. In addition,
while certain aspects of the invention may be presented in certain claim
forms at certain times, the inventors contemplate the various aspects of the
invention in any available claim form. For example, while only some
aspects of the invention may be recited as being embodied in a computer-
readable medium at particular times, other aspects may likewise be so
embodied.

Administrative Status

Title Date
Forecasted Issue Date 2021-07-06
(86) PCT Filing Date 2018-07-13
(87) PCT Publication Date 2019-01-17
(85) National Entry 2020-01-13
Examination Requested 2020-01-13
(45) Issued 2021-07-06

Abandonment History

There is no abandonment history.

Maintenance Fee

Last Payment of $277.00 was received on 2024-05-24


 Upcoming maintenance fee amounts

Description Date Amount
Next Payment if standard fee 2025-07-14 $277.00
Next Payment if small entity fee 2025-07-14 $100.00

Note : If the full payment has not been received on or before the date indicated, a further fee may be required which may be one of the following

  • the reinstatement fee;
  • the late payment fee; or
  • additional fee to reverse deemed expiry.

Patent fees are adjusted on the 1st of January every year. The amounts above are the current amounts if received by December 31 of the current year.
Please refer to the CIPO Patent Fees web page to see all current fee amounts.

Payment History

Fee Type Anniversary Year Due Date Amount Paid Paid Date
Registration of a document - section 124 2020-01-13 $100.00 2020-01-13
Registration of a document - section 124 2020-01-13 $100.00 2020-01-13
Application Fee 2020-01-13 $400.00 2020-01-13
Request for Examination 2023-07-13 $800.00 2020-01-13
Maintenance Fee - Application - New Act 2 2020-07-13 $100.00 2020-07-08
Notice of Allow. Deemed Not Sent return to exam by applicant 2020-10-14 $400.00 2020-10-14
Final Fee 2021-07-09 $514.08 2021-05-18
Maintenance Fee - Application - New Act 3 2021-07-13 $100.00 2021-06-07
Registration of a document - section 124 2021-06-10 $100.00 2021-06-10
Maintenance Fee - Patent - New Act 4 2022-07-13 $100.00 2022-06-27
Registration of a document - section 124 $100.00 2023-01-25
Registration of a document - section 124 $100.00 2023-05-01
Registration of a document - section 124 $100.00 2023-05-01
Maintenance Fee - Patent - New Act 5 2023-07-13 $210.51 2023-05-31
Maintenance Fee - Patent - New Act 6 2024-07-15 $277.00 2024-05-24
Owners on Record

Note: Records showing the ownership history in alphabetical order.

Current Owners on Record
MFTB HOLDCO, INC.
Past Owners on Record
PUSH SUB I, INC.
ZILLOW GROUP, INC.
ZILLOW, INC.
ZILLOW, LLC
Past Owners that do not appear in the "Owners on Record" listing will appear in other documentation within the application.
Documents

List of published and non-published patent-specific documents on the CPD.


Document Description    Date (yyyy-mm-dd)    Number of pages    Size of Image (KB)
Abstract 2020-01-13 2 93
Drawings 2020-01-13 17 571
Description 2020-01-13 66 3,567
Representative Drawing 2020-01-13 1 40
Patent Cooperation Treaty (PCT) 2020-01-13 1 42
International Preliminary Report Received 2020-01-13 60 2,798
International Search Report 2020-01-13 2 79
Declaration 2020-01-13 2 39
National Entry Request 2020-01-13 14 399
Claims 2020-01-13 21 981
International Preliminary Report Received 2020-01-13 21 941
PPH Request 2020-01-13 6 289
PPH OEE 2020-01-13 97 6,555
Final Fee 2021-05-18 5 133
Description 2020-01-14 66 3,647
Examiner Requisition 2020-02-19 4 170
Cover Page 2020-02-28 1 61
Amendment 2020-05-06 28 1,122
Description 2020-05-06 66 3,634
Claims 2020-05-06 21 869
Withdrawal from Allowance / Amendment 2020-10-14 58 2,530
Claims 2020-10-14 51 2,314
Examiner Requisition 2020-10-27 4 205
Amendment 2020-11-06 111 5,577
Claims 2020-11-06 51 2,296
Representative Drawing 2021-06-15 1 21
Cover Page 2021-06-15 1 62
Electronic Grant Certificate 2021-07-06 1 2,527